[R] POSIXct dates on x-axis using xyplot
I am using 'xyplot' in lattice to plot some data where the x-axis is a POSIXct date. I have data which spans a 6 month period, but when I plot it, only the last month is printed on the right hand side of the axis. I would have expected that at least I would have a beginning and an ending point so that I have a point of reference as to the time that the data spans. Here is some test data.

    > # create test data
    > dates <- seq(as.POSIXct('2006-01-03'), as.POSIXct('2006-06-26'), by='1 week')
    > my.data <- seq(1, length=length(dates))
    > require(lattice)
    [1] TRUE
    > # plot only shows a single month (Jul on the right). Would have
    > # expected at least the beginning and the ending month since this spans
    > # a 6 month period
    > pdf('/test.pdf')
    > xyplot(my.data ~ dates)
    > dev.off()
    windows
          2
    > sessionInfo()
    R version 2.5.1 (2007-06-27)
    i386-pc-mingw32

    locale:
    LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

    attached base packages:
    [1] stats     graphics  grDevices utils     datasets  methods
    [7] base

    other attached packages:
    lattice
     0.16-5
    > Sys.info()
    sysname   Windows
    release   NT 5.1
    version   (build 2600) Service Pack 2
    nodename  JIM-LAPTOP
    machine   x86
    login     jim holtman
    user      jim holtman

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
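One way to get labelled reference points along the axis is to pass explicit tick positions and labels through the 'scales' argument of xyplot. The following is only a sketch, not a confirmed fix for the behaviour reported above; the monthly 'ticks' vector and the '%b %Y' label format are assumptions chosen for illustration.

```r
library(lattice)

# test data from the post: weekly POSIXct dates over roughly six months
dates <- seq(as.POSIXct('2006-01-03'), as.POSIXct('2006-06-26'), by = '1 week')
my.data <- seq(1, length = length(dates))

# assumed tick positions: the first day of each month covered by the data
ticks <- seq(as.POSIXct('2006-01-01'), as.POSIXct('2006-06-01'), by = '1 month')

# supply both the positions and the label text explicitly
p <- xyplot(my.data ~ dates,
            scales = list(x = list(at = as.numeric(ticks),
                                   labels = format(ticks, '%b %Y'))))
print(p)
```

Giving the labels as ready-made strings avoids relying on the default date formatting of the axis.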
Re: [R] fitdistr()
I assume that you want to do the fitdistr on one of the columns of the dataframe that you have read in. What does 'str(ONES3)' show? If the data is in the first column, try:

    fitdistr(ONES3[[1]], "chi-squared")

On 9/9/07, Terence Broderick wrote:

I am trying to fit the chi-squared distribution to a set of data using the fitdistr function found in the MASS4 library. The data set is called ONES3; I have loaded it using the command

    ONES3 <- read.table("ONES3.pdf", header=TRUE, na="NA")

I print out the dataset ONES3 to the screen to make sure it has loaded. Then I try to fit this data using the command fitdistr

    fitdistr(ONES3, "chi-squared")

and it returns the comment

    Error in fitdistr(ONES3, "chi-squared") : 'x' must be a non-empty numeric vector

Can anybody help with this? I imagine it is a common mistake for beginners like myself.

audaces fortuna iuvat

-- Jim Holtman
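For reference, here is a hedged sketch of the suggested call on simulated data. As far as I know, MASS::fitdistr does not compute starting values for "chi-squared", so a 'start' list is supplied; the simulated sample, the seed, the starting value df = 2 and the lower bound are all assumptions for illustration, not part of the original thread.

```r
library(MASS)

set.seed(1)
# hypothetical stand-in for the first column of ONES3
x <- rchisq(500, df = 4)

# 'x' must be a numeric vector, not a data frame; 'start' names the parameter,
# and 'lower' keeps the optimiser away from non-positive df
fit <- fitdistr(x, "chi-squared", start = list(df = 2), lower = 0.01)
fit$estimate
```

With a data frame, the column has to be extracted first (for example with `[[1]]`), which is exactly what the reply points out.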
Re: [R] R first.id last.id function error
This function should do it for you:

    > file1 <- read.table(textConnection("   id rx week dv1
    + 1   1  1    1   1
    + 2   1  1    2   1
    + 3   1  1    3   2
    + 4   2  1    1   3
    + 5   2  1    2   4
    + 6   2  1    3   1
    + 7   3  1    1   2
    + 8   3  1    2   3
    + 9   3  1    3   4
    + 10  4  1    1   2
    + 11  4  1    2   6
    + 12  4  1    3   5
    + 13  5  2    1   7
    + 14  5  2    2   8
    + 15  5  2    3   5
    + 16  6  2    1   2
    + 17  6  2    2   4
    + 18  6  2    3   6
    + 19  7  2    1   7
    + 20  7  2    2   8
    + 21  8  2    1   9
    + 22  9  2    1   4
    + 23  9  2    2   5"), header=TRUE)
    > mark.function <-
    + function(df){
    +     df <- df[order(df$id, df$week),]
    +     # create 'diff' of 'id' to determine where the breaks are
    +     breaks <- diff(df$id)
    +     # the first entry will be TRUE, and then every occurrence of non-zero in breaks
    +     df$first.id <- c(TRUE, breaks != 0)
    +     # the last entry is TRUE and every non-zero breaks
    +     df$last.id <- c(breaks != 0, TRUE)
    +     df
    + }
    > mark.function(file1)
       id rx week dv1 first.id last.id
    1   1  1    1   1     TRUE   FALSE
    2   1  1    2   1    FALSE   FALSE
    3   1  1    3   2    FALSE    TRUE
    4   2  1    1   3     TRUE   FALSE
    5   2  1    2   4    FALSE   FALSE
    6   2  1    3   1    FALSE    TRUE
    7   3  1    1   2     TRUE   FALSE
    8   3  1    2   3    FALSE   FALSE
    9   3  1    3   4    FALSE    TRUE
    10  4  1    1   2     TRUE   FALSE
    11  4  1    2   6    FALSE   FALSE
    12  4  1    3   5    FALSE    TRUE
    13  5  2    1   7     TRUE   FALSE
    14  5  2    2   8    FALSE   FALSE
    15  5  2    3   5    FALSE    TRUE
    16  6  2    1   2     TRUE   FALSE
    17  6  2    2   4    FALSE   FALSE
    18  6  2    3   6    FALSE    TRUE
    19  7  2    1   7     TRUE   FALSE
    20  7  2    2   8    FALSE    TRUE
    21  8  2    1   9     TRUE    TRUE
    22  9  2    1   4     TRUE   FALSE
    23  9  2    2   5    FALSE    TRUE

On 9/7/07, Gerard Smits wrote:

Hi R users,

I have a test dataframe (file1, shown below) for which I am trying to create a flag for the first and last ID record (equivalent to the SAS first.id and last.id variables).

Dump of file1:

    > file1
       id rx week dv1
    1   1  1    1   1
    2   1  1    2   1
    3   1  1    3   2
    4   2  1    1   3
    5   2  1    2   4
    6   2  1    3   1
    7   3  1    1   2
    8   3  1    2   3
    9   3  1    3   4
    10  4  1    1   2
    11  4  1    2   6
    12  4  1    3   5
    13  5  2    1   7
    14  5  2    2   8
    15  5  2    3   5
    16  6  2    1   2
    17  6  2    2   4
    18  6  2    3   6
    19  7  2    1   7
    20  7  2    2   8
    21  8  2    1   9
    22  9  2    1   4
    23  9  2    2   5

I have written code that correctly assigns the first.id and last.id variables:

    require(Hmisc) #for Lags
    #ascending order to define first dot
    file1 <- file1[order(file1$id, file1$week),]
    file1$first.id <- (Lag(file1$id) != file1$id)
    file1$first.id[1] <- TRUE #force NA to TRUE
    #descending order to define last dot
    file1 <- file1[order(-file1$id,-file1$week),]
    file1$last.id <- (Lag(file1$id) != file1$id)
    file1$last.id[1] <- TRUE #force NA to TRUE
    #resort to original order
    file1 <- file1[order(file1$id,file1$week),]

I am now trying to get the above code to work as a function, and am clearly doing something wrong:

    > first.last <- function (df, idvar, sortvars1, sortvars2)
    + {
    +     #sort in ascending order to define first dot
    +     df <- df[order(sortvars1),]
    +     df$first.idvar <- (Lag(df$idvar) != df$idvar)
    +     #force first record NA to TRUE
    +     df$first.idvar[1] <- TRUE
    +
    +     #sort in descending order to define last dot
    +     df <- df[order(-sortvars2),]
    +     df$last.idvar <- (Lag(df$idvar) != df$idvar)
    +     #force last record NA to TRUE
    +     df$last.idvar[1] <- TRUE
    +
    +     #resort to original order
    +     df <- df[order(sortvars1),]
    + }

Function call:

    first.last(df=file1, idvar=file1$id, sortvars1=c(file1$id,file1$week), sortvars2=c(-file1$id,-file1$week))

R Error:

    Error in as.vector(x, mode) : invalid argument 'mode'

I am not sure about the passing of the sort strings. Perhaps this is where things are off. Any help greatly appreciated.

Thanks,
Gerard
-- Jim Holtman
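As a hedged alternative to the diff-based function in this thread, the same two flags can be sketched with duplicated(), assuming the data frame is already sorted by id and week. The small data frame below is made up for illustration.

```r
# small hypothetical data frame, already sorted by id and week
df <- data.frame(id   = c(1, 1, 1, 2, 2, 3),
                 week = c(1, 2, 3, 1, 2, 1))

# first row of each id: the id has not been seen before
df$first.id <- !duplicated(df$id)
# last row of each id: the id is not seen again when scanning from the end
df$last.id <- !duplicated(df$id, fromLast = TRUE)
df
```

An id with a single record (id 3 here) gets TRUE for both flags, matching the behaviour of first.id and last.id described in the question.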
Re: [R] Plotting lines to sets of points
?segments

On 9/7/07, lawnboy34 wrote:

I am using R to plot baseball spray charts from play-by-play data. I have used the following command to plot the diamond:

    plot(0:250, -250:0, type="n", bg="white")
    lines(c(125,150,125,100,125), c(-210,-180,-150,-180,-210), col=c("black"))

I have also plotted different hit locations using commands such as the following:

    points(subset(framename$hit_x, framename$hit_traj=="line_drive"),
           subset(-framename$hit_y, framename$hit_traj=="line_drive"),
           pch=20, col=c("red"))

My question: Is there any easy way to plot a line from the origin (home plate) to each point on the graph? Preferably the line would share the same color as the dot that denotes where the ball landed. I have tried searching Google and these forums, and most graphing questions have to do with scatterplots or other varieties of graphs I am not using.

Thanks very much in advance.
-Jason

-- Jim Holtman
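A minimal sketch of how segments() might be used here. The 'hits' data frame, the colour mapping, and the choice of (125, -210) as home plate (the bottom corner of the diamond drawn in the question) are assumptions for illustration only.

```r
# hypothetical hit locations; column names follow the post
hits <- data.frame(hit_x    = c(60, 125, 190),
                   hit_y    = c(120, 60, 110),
                   hit_traj = c("line_drive", "fly_ball", "line_drive"))

# one colour per trajectory type (assumed mapping)
traj.col <- c(line_drive = "red", fly_ball = "blue")
cols <- unname(traj.col[as.character(hits$hit_traj)])

plot(0:250, -250:0, type = "n")
lines(c(125, 150, 125, 100, 125), c(-210, -180, -150, -180, -210), col = "black")
points(hits$hit_x, -hits$hit_y, pch = 20, col = cols)

# one line per hit, from the assumed home plate position to the landing spot
segments(125, -210, hits$hit_x, -hits$hit_y, col = cols)
```

segments() is vectorised over its end points and over 'col', so each line picks up the colour of its own dot.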
Re: [R] remove particular elements in a vector
    x <- answer(100)
    x <- x[!is.na(x)]  # remove NAs

On 9/7/07, kevinchang wrote:

Hi,

Is there any built-in function allowing us to remove a particular group of elements in a vector? For example, I want to remove all the NAs in the output of the answer function. Please help. Thanks

    > answer(100)
     [1]  1  2 NA  4 NA NA  7  8 NA NA 11 NA 13 14 NA 16 17 NA 19 NA NA 22 23 NA NA
    [26] 26 NA 28 29 NA 31 32 NA 34 NA NA 37 38 NA NA 41 NA 43 44 NA 46 47 NA 49 NA
    [51] NA 52 53 NA NA 56 NA 58 59 NA 61 62 NA 64 NA NA 67 68 NA NA 71 NA 73 74 NA
    [76] 76 77 NA 79 NA NA 82 83 NA NA 86 NA 88 89 NA 91 92 NA 94 NA NA 97 98 NA NA

-- Jim Holtman
Re: [R] Problem with the aggregate command
Your 'lst' is not the same length as either set1 or set2. If one of your columns in the dataframe is the year, then you should have:

    aggregate(set1, list(year = set1$year), median)

On 9/7/07, Anup Nandialath wrote:

Dear friends,

I have a data set with 23 columns and 38000 rows. It is a panel running from the years 1991 through 2005. I want to aggregate the data and get the medians of each of the 23 columns for each of the years. In other words my output should be like this

    Year  Median
    1991  123
    1992  145
    1993  132
    etc.

The sample lines of code to do this operation are

    set1 <- subset(as.data.frame(dataset), rep1==1)
    set2 <- subset(as.data.frame(dataset), rep1==0)
    lst <- list(unique(yeara))
    y1 <- aggregate(set1, lst, median)
    y2 <- aggregate(set2, lst, median)

However I'm getting an error as follows

    Error in FUN(X[[1]], ...) : arguments must have same length

Can somebody please help me with what I'm doing wrong here? Thanks in advance

Regards
Anup

-- Jim Holtman
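The point about lengths can be illustrated with a small made-up panel (the column names 'a' and 'b' and the values are hypothetical). The grouping list passed as 'by' needs one entry per row of the data, not one entry per distinct year.

```r
# hypothetical panel: two numeric columns observed over three years
set1 <- data.frame(year = rep(1991:1993, each = 3),
                   a = c(1, 2, 3, 4, 5, 6, 7, 8, 9),
                   b = c(10, 30, 20, 40, 60, 50, 70, 90, 80))

# 'by' must be a list whose elements are as long as the data has rows
y1 <- aggregate(set1[c("a", "b")], by = list(year = set1$year), FUN = median)
y1
```

Using list(unique(...)) for 'by' shortens the grouping vector to one value per year, which is what triggers the "arguments must have same length" error in the question.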
Re: [R] list element to matrix
If they are already a matrix in the list, then you don't have to use 'as.matrix'; you can just say:

    M1 <- D[[1]]

Now the question is, what do you mean by "how do you index M1"? Do you want to go through the list applying a function to each matrix? If so, then just use 'lapply'. For example, to get the column means, you would do:

    mean.list <- lapply(D, colMeans)

Can you explain in a little more detail the problem you are trying to solve?

On 9/5/07, [EMAIL PROTECTED] wrote:

I have created a list of matrices using sapply or lapply and wish to extract each of the matrices as a matrix. Some of them are 2x2, 3x3, etc. I can do this one at a time as:

    M1 <- as.matrix(D[[1]])

How can I repeat this process for an unknown number of entries in the list? In other words, how shall I index M1?

Diana

-- Jim Holtman
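A short sketch of both options mentioned in the reply, using a made-up list 'D' of two matrices: lapply when a function is applied to every element, and a loop over positions when the index itself matters.

```r
# hypothetical list of matrices of different sizes
D <- list(matrix(1:4, 2, 2), matrix(1:9, 3, 3))

# apply a function to every matrix without naming each one
mean.list <- lapply(D, colMeans)

# or loop over the positions when the index itself is needed
for (i in seq_along(D)) {
    M <- D[[i]]  # already a matrix; no as.matrix() needed
    cat("matrix", i, "has dimensions", dim(M), "\n")
}
```

Keeping the matrices in the list and indexing with D[[i]] avoids having to create M1, M2, ... as separate objects.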
Re: [R] writing elements in list as a data frame
Try this:

    > sls <- list(a=matrix(sample(10), ncol=2, dimnames=list(NULL, c('x', 'y'))),
    +             b=matrix(sample(16), ncol=2, dimnames=list(NULL, c('x', 'y'))))
    > sls
    $a
         x  y
    [1,] 8  2
    [2,] 9 10
    [3,] 4  1
    [4,] 5  7
    [5,] 3  6

    $b
          x  y
    [1,]  4 14
    [2,]  3 15
    [3,] 16  5
    [4,]  1  9
    [5,]  8  7
    [6,] 10  2
    [7,] 12 13
    [8,] 11  6

    > # create output matrix
    > do.call('rbind', lapply(names(sls), function(.name){
    +     data.frame(sls[[.name]], Name=.name)
    + }))
        x  y Name
    1   8  2    a
    2   9 10    a
    3   4  1    a
    4   5  7    a
    5   3  6    a
    6   4 14    b
    7   3 15    b
    8  16  5    b
    9   1  9    b
    10  8  7    b
    11 10  2    b
    12 12 13    b
    13 11  6    b

On 9/5/07, Srinivas Iyyer wrote:

Dear R-helpers,

Lists in R are stumbling block for me. I kindly ask you to help me able to write a data-frame. I have a list of lists.

    > sls[1:2]
    $Andromeda_maya1
           x   y
    [1,] 369 103
    [2,] 382 265
    [3,] 317 471
    [4,] 169 465
    [5,] 577 333

    $Andromeda_maya2
           x   y
    [1,] 173 507
    [2,] 540 395
    [3,] 268 143
    [4,] 346 175
    [5,] 489  91

I want to be able to write a data.frame like the following:

    X    Y    Name
    369  103  Andromeda_maya1
    382  265  Andromeda_maya1
    317  471  Andromeda_maya1
    169  465  Andromeda_maya1
    577  333  Andromeda_maya1
    173  507  Andromeda_maya2
    540  395  Andromeda_maya2
    268  143  Andromeda_maya2
    346  175  Andromeda_maya2
    489  91   Andromeda_maya2

Is there a way to convert this list-of-list into a data.frame?

Thanks
srini

-- Jim Holtman
Re: [R] change all . to 0 in a data.frame
Here is one way. You might want to read in the data with 'as.is=TRUE' to prevent conversion to factors.

    > x <- data.frame(a=c(1,2,3,'.',5,'.'))
    > str(x)
    'data.frame':   6 obs. of  1 variable:
     $ a: Factor w/ 5 levels ".","1","2","3",..: 2 3 4 1 5 1
    > # replace '.' with zero; either read in with 'as.is=TRUE' or convert to character
    > x$a <- as.character(x$a)
    > x$a[x$a == '.'] <- '0'
    > x$a <- as.numeric(x$a)
    > str(x)
    'data.frame':   6 obs. of  1 variable:
     $ a: num  1 2 3 0 5 0

On 9/5/07, Dieter Best wrote:

Hello,

I read in a tab delimited text file via mydata = read.delim(myfile). The text file was originally an excel file where . was used in place of 0. Now all the columns which should be integers are factors. Any ideas how to change all the . to 0 and factors back to integer?

Thanks a lot in advance for any suggestions,
-- D

-- Jim Holtman
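Another possibility, sketched here under the assumption that '.' is the only non-numeric marker in the file, is to declare it as a missing-value string when reading and then replace the resulting NAs with 0. The inline text below is a hypothetical stand-in for the real file.

```r
# hypothetical tab-delimited input in which '.' stands for 0
txt <- "a\tb\n1\t.\n.\t4\n3\t5\n"

# treat '.' as missing on input so the columns stay numeric ...
mydata <- read.delim(text = txt, na.strings = ".")
# ... then replace the missing values with 0
mydata[is.na(mydata)] <- 0
str(mydata)
```

Note that this also turns any genuinely missing values into 0, so it only fits data where '.' is the sole source of NAs.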
Re: [R] Confusion using functions to access the function call stack example section
It is because you have a recursive function call, and the value of 'y' when you print it is 0. I have added another statement that might help clarify what you are seeing. At the point at which the most current value of the function 'ggg' is evaluated (last call), the value of 'y' is zero and you are 5 levels down from the 'main frame':

    > gg <- function(y) {
    +     cat("gg y=", y, "current frame =", sys.nframe(), "\n")
    +     ggg <- function() {
    +         cat("y = ", y, "\n")
    +         cat("current frame is ", sys.nframe(), "\n")
    +         cat("parents are ", sys.parents(), "\n")
    +         print(sys.function(0)) # ggg
    +         print(sys.function(2)) # gg
    +     }
    +
    +     if (y > 0) gg(y-1) else ggg()
    + }
    > gg(3)
    gg y= 3 current frame = 1
    gg y= 2 current frame = 2
    gg y= 1 current frame = 3
    gg y= 0 current frame = 4
    y =  0
    current frame is  5
    parents are  0 1 2 3 4
    function() {
        cat("y = ", y, "\n")
        cat("current frame is ", sys.nframe(), "\n")
        cat("parents are ", sys.parents(), "\n")
        print(sys.function(0)) # ggg
        print(sys.function(2)) # gg
    }
    <environment: 0x01cf5f6c>
    function(y) {
        cat("gg y=", y, "current frame =", sys.nframe(), "\n")
        ggg <- function() {
            cat("y = ", y, "\n")
            cat("current frame is ", sys.nframe(), "\n")
            cat("parents are ", sys.parents(), "\n")
            print(sys.function(0)) # ggg
            print(sys.function(2)) # gg
        }

        if (y > 0) gg(y-1) else ggg()
    }

On 9/4/07, Leeds, Mark (IED) wrote:

I was going through the example below, which is taken from the example section in the R documentation for accessing the function call stack. I am confused and I have 3 questions that I was hoping someone could answer.

1) Why is y equal to zero even though the call was done with gg(3)?

2) What does "parents are 0,1,2,0,4,5,6,7" mean? I understand what a parent frame is, but how do the numbers relate to this particular example? Why is the current frame number 8?

3) It says that sys.function(2) should be gg, but I would think that sys.function(1) would be gg since it's one up from where the call is being made.

Thanks a lot. If the answers are too complicated and someone knows of a good reference that goes into more details about the sys functions, that's appreciated also.

    gg <- function(y) {
        ggg <- function() {
            cat("y = ", y, "\n")
            cat("current frame is ", sys.nframe(), "\n")
            cat("parents are ", sys.parents(), "\n")
            print(sys.function(0)) # ggg
            print(sys.function(2)) # gg
        }
        if (y > 0) gg(y-1) else ggg()
    }
    gg(3)

    # OUTPUT
    y =  0
    current frame is  8
    parents are  0 1 2 0 4 5 6 7
    function() {
        cat("y = ", y, "\n")
        cat("current frame is ", sys.nframe(), "\n")
        cat("parents are ", sys.parents(), "\n")
        print(sys.function(0)) # ggg
        print(sys.function(2)) # gg
    }
    <environment: 0x8a9cc68>
    function (expr, envir = parent.frame(), enclos = if (is.list(envir) ||
        is.pairlist(envir)) parent.frame() else baseenv())
    .Internal(eval.with.vis(expr, envir, enclos))
    <environment: 0x8974ea0>

-- Jim Holtman
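The frame numbering described in the reply can be checked with a smaller function. This is an illustrative sketch only (the 'frames' helper is hypothetical and not from the thread): each recursive call records its own sys.nframe(), and consecutive values differ by one regardless of how deep the outermost call starts.

```r
# record the frame number of every call in a chain of recursive calls
frames <- function(n, acc = integer(0)) {
    acc <- c(acc, sys.nframe())          # frame number of this call
    if (n > 0) frames(n - 1, acc) else acc
}

f <- frames(3)
f         # four frame numbers, one per call
diff(f)   # each recursive call sits one frame deeper than its caller
```

Because the absolute numbers depend on where the first call is made from (top level, inside source(), inside another function), only the differences are stable, which is also why the questioner saw frame 8 where the reply shows frame 5.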
Re: [R] Howto sort dataframe columns by colMeans
Here is one way of doing it by 'skipping' the first column, which is a factor and your 'time':

    > x <- read.table(textConnection("time   met-a  met-b  met-c
    + 00:00  42     18     99
    + 00:05  88     16     67
    + 00:10  80     27     84"), header=TRUE)
    > x.mean <- colMeans(x[-1])
    > x.new <- x[,c('time', names(sort(x.mean, decreasing=TRUE)))]
    > x.new
       time met.c met.a met.b
    1 00:00    99    42    18
    2 00:05    67    88    16
    3 00:10    84    80    27

On 9/4/07, Lynn Osburn wrote:

I read from an external data source containing several columns. Each column represents the value of a metric. The columns are time series data. I want to sort the resulting dataframe such that the column with the largest mean is the leftmost column, descending in colMean values to the right.

I see many solutions for sorting rows based on some column characteristic, but haven't found any discussion of sorting columns based on column characteristics.

viz. input data looks like this

    time   met-a  met-b  met-c
    00:00  42     18     99
    00:05  88     16     67
    00:10  80     27     84

desired output:

    time   met-c  met-a  met-b
    00:00  99     42     18
    00:05  67     88     16
    00:10  84     80     27

Thanks,
-Lynn

-- Jim Holtman
Re: [R] data.frame loses name when constructed with one column
Try drop=FALSE:

    > x
      out pred1 predd2
    1   1   2.0    3.0
    2   2   3.5    5.5
    3   3   5.5   11.0
    > x[,1]
    [1] 1 2 3
    > data.frame(x[,1])
      x...1.
    1      1
    2      2
    3      3
    > data.frame(x[,1, drop=FALSE])
      out
    1   1
    2   2
    3   3

On 9/4/07, Stan Hopkins wrote:

Not sure why the data.frame function does not capture the name of the column field when it's being built with only one column. Can anyone help?

    > data
      out pred1 predd2
    1   1   2.0    3.0
    2   2   3.5    5.5
    3   3   5.5   11.0
    > data1=data.frame(data[,1])
    > data1
      data...1.
    1         1
    2         2
    3         3
    > data1=data.frame(data[,1:2])
    > data1
      out pred1
    1   1   2.0
    2   2   3.5
    3   3   5.5
    > sessionInfo()
    R version 2.5.1 (2007-06-27)
    i386-pc-mingw32

    locale:
    LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

    attached base packages:
    [1] stats     graphics  grDevices utils     datasets  methods
    [7] base

-- Jim Holtman
Re: [R] data formatting: from rows to columns
Here is a way using sprintf:

    x <- read.table(textConnection("         V2  V3
    27  2032567  19
    28  2035482  19
    126 2472826  19
    132 2473320  19
    136 2035480 135
    145 2062458 135
    148 2074927 135
    151 2102395 142
    156 2027252 142
    158 2473082 142"))
    # output the data
    cat(sprintf("%d\n%d\n\n", x$V2, x$V3), sep='', file='tempxx.txt')

On 8/28/07, Federico Calboli wrote:

Hi All,

I have some data I need to write as a file from R to use in a different program. My data comes as a numeric matrix of n rows and 2 columns. I need to transform each row into a two-row, one-column output, and separate the output of each row with a blank line.

For instance I need to go from this:

             V2  V3
    27  2032567  19
    28  2035482  19
    126 2472826  19
    132 2473320  19
    136 2035480 135
    145 2062458 135
    148 2074927 135
    151 2102395 142
    156 2027252 142
    158 2473082 142

to

    2032567
    19

    2035482
    19

    2472826
    19

    2473320
    19

    2035480
    135

    ...

Any hint? I seem a bit stuck. cat(unlist(data), file ='data.txt', sep = '\n') (obviously) does not work...

Cheers,
Fede

--
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG
Tel +44 (0)20 7594 1602 Fax (+44) 020 7594 3193
f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

-- Jim Holtman
Re: [R] alternate methods to perform a calculation
I think you can use 'outer':

    outer(xk$xk1, x$x1, function(y,z) abs(z-y))
    outer(xk$xk2, x$x2, function(y,z) abs(z-y))

On 8/28/07, dxc13 wrote:

Consider a data frame (x) with 2 variables, x1 and x2, having equal values. It looks like:

    x1  x2
    1   1
    2   2
    3   3

Now, consider a second data frame (xk):

    xk1  xk2
    0.5  0.5
    1.0  0.5
    1.5  0.5
    2.0  0.5
    0.5  1
    1.0  1
    1.5  1
    2.0  1
    0.5  1.5
    1.0  1.5
    1.5  1.5
    2.0  1.5
    0.5  2
    1.0  2
    1.5  2
    2.0  2

I have written code to calculate some differences between these two data sets; the main idea is to subtract off each element of xk1 from each value of x1, and similarly for xk2 and x2. This is what I have:

    w1 <- array(NA, dim=c(nrow(xk), length(x$x1)))
    w2 <- array(NA, dim=c(nrow(xk), length(x$x2)))
    for (j in 1:nrow(xk)) {
        w1[j,] <- abs(x$x1 - xk$xk1[j])
        w2[j,] <- abs(x$x2 - xk$xk2[j])
    }

Is there a way to do the above calculation without use of a FOR loop?

Thank you
Derek

-- Jim Holtman
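As a quick check that the loop and the 'outer' call agree, here is a sketch that rebuilds the two data frames from the question (using expand.grid for xk, which is an assumption about how the 16 rows were generated) and compares the results for the first variable.

```r
x  <- data.frame(x1 = 1:3, x2 = 1:3)
xk <- expand.grid(xk1 = c(0.5, 1, 1.5, 2), xk2 = c(0.5, 1, 1.5, 2))

# loop version from the question
w1 <- array(NA, dim = c(nrow(xk), length(x$x1)))
for (j in 1:nrow(xk)) w1[j, ] <- abs(x$x1 - xk$xk1[j])

# vectorised version: row j, column i holds |x1[i] - xk1[j]|
w1.outer <- outer(xk$xk1, x$x1, function(y, z) abs(z - y))

all.equal(w1, w1.outer)
```

The first argument of outer() indexes the rows of the result, so putting the xk column first reproduces the 16 x 3 layout of the loop.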
Re: [R] how to calculate mean into a list
try:

    colMeans(do.call('rbind', lapply(a0, mean)))

On 8/28/07, Weiwei Shi wrote:

Dear Listers:

I have this task: suppose a0 is a list of 10 data.frames. I want to calculate something like this:

    (a0[[1]]+a0[[2]]+..+a0[[10]])/10

Thanks.

--
Weiwei Shi, Ph.D
Research Scientist
GeneGO, Inc.

"Did you always know?" "No, I did not. But I believed..." ---Matrix III

-- Jim Holtman
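A hedged alternative sketch: the element-wise average written out in the question can be computed with Reduce, assuming all data frames in the list have the same dimensions and contain only numeric columns. In current versions of R, mean() applied to a data frame no longer returns column means, so the one-liner in the reply may not behave as it did in 2007. The three-element list 'a0' below is made up for illustration.

```r
a0 <- list(data.frame(u = c(1, 2), v = c(3, 4)),
           data.frame(u = c(3, 4), v = c(5, 6)),
           data.frame(u = c(5, 6), v = c(7, 8)))

# add the data frames element by element, then divide by how many there are
avg <- Reduce("+", a0) / length(a0)
avg
```

The result is again a data frame of the same shape, with each cell holding the mean of the corresponding cells across the list.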
Re: [R] subset question
Here is one way of checking to see if a row contains a particular value and setting the contents of a new column:

    n <- 20
    # create test data
    x <- data.frame(sample(letters,n), sample(letters,n), sample(letters,n), sample(letters,n))
    # add a column indicating if the row contains 'a', 'b' or 'c'
    x$a <- apply(x[, 1:4], 1, function(.row) any(.row %in% c('a','b','c'))) + 0

On 8/27/07, Kirsten Beyer wrote:

I would like to code records in a dataset with a 1 if any of the columns 9-67 contain a particular code, and zero if they don't. I've been working with subset and it seems that something like subset(data, data[9:67]--12345) would work, but I have been unsuccessful so far. It seems like a simple problem - any help is appreciated!

-- Jim Holtman
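For a single numeric code, a vectorised variant of the same idea can be sketched with rowSums. The data frame 'dat', its column names and the column range are hypothetical; in the question the code columns would be 9:67.

```r
# hypothetical data: one id column followed by two code columns
dat <- data.frame(id = 1:4,
                  c1 = c(12345, 111, 222, NA),
                  c2 = c(333, 444, 12345, 555))
code.cols <- 2:3      # would be 9:67 in the question
target <- 12345

# 1 if any of the code columns equals the target, 0 otherwise
dat$flag <- as.integer(rowSums(dat[, code.cols] == target, na.rm = TRUE) > 0)
dat
```

With na.rm = TRUE, empty (NA) code slots simply count as non-matches, which matters when records carry different numbers of codes.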
Re: [R] fill circles
Here is a function that will generate a color sequence for an input vector. You can specify the colors to use, the range, and the number of color steps:

    # specify the colors and the number of increments you want for a specified
    # range. It will return the colors for the input vector
    f.color <- function(input,                             # input vector
                        colors=c('green','yellow','red'),  # desired colors
                        input.range=c(0,0.01),             # range of input to create colors
                        input.steps=10)                    # number of increments
    {
        myColors <- colorRampPalette(colors)(input.steps)  # generate colors
        myColors[cut(input, seq(input.range[1], input.range[2], length=input.steps+1),
                     labels=FALSE, include.lowest=TRUE)]
    }
    # generate a legend to show colors
    plot.new()  # create blank plot
    x <- round(runif(15), 3)
    legend('topleft', legend=x, fill=f.color(x, input.range=c(0,1)))
    legend('topright', legend=x, fill=f.color(x, input.range=c(0,1),
        colors=c('purple','red','blue','orange')))
    legend('top', legend=x, fill=f.color(x, input.range=c(0,1),
        colors=c('red','yellow','green')))

So you should be able to use something like this.

On 8/25/07, Cristian cristian wrote:

Hi all,

I'm an R newbie. I wrote this script to create a scatterplot using the trees data from the datasets package:

    library('datasets')
    with(trees, {
        plot(Height, Volume, pch=3, xlab="Height", ylab="Volume")
        symbols(Height, Volume, circles=Girth/12, fg="grey", inches=FALSE, add=FALSE)
    })

I'd like to use the column named Height to fill the circles with colors (e.g. the small numbers in green, then yellow, and the high numbers in red). I'd like to have a legend for the size and the colors too.

I did it manually using a script like this:

    color[(x>=0.001)&(x<0.002)] <- "#41FF41"
    color[(x>=0.002)&(x<0.003)] <- "#2BFF2B"
    color[(x>=0.003)&(x<0.004)] <- "#09FF09"
    color[(x>=0.004)&(x<0.005)] <- "#00FE00"
    color[(x>=0.005)&(x<0.006)] <- "#00F700"
    color[(x>=0.006)&(x<0.007)] <- "#00E400"
    color[(x>=0.007)&(x<0.008)] <- "#00D600"
    color[(x>=0.008)&(x<0.009)] <- "#00C300"

and so on, but I don't like to do it manually... do you know a solution?

Thank you very much
chris

-- Jim Holtman
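To connect the colour function back to the original plot, here is a sketch that bins Height into colour steps and passes the result to symbols() through its 'bg' argument. The number of steps and the use of the observed Height range as the colour range are assumptions, and the binning is written out inline so that the block runs on its own.

```r
# colour ramp from low (green) to high (red), in 10 steps (assumed)
n.steps <- 10
pal <- colorRampPalette(c('green', 'yellow', 'red'))(n.steps)

# assign each Height to one of n.steps equal-width bins over its range
bin <- cut(trees$Height,
           seq(min(trees$Height), max(trees$Height), length.out = n.steps + 1),
           labels = FALSE, include.lowest = TRUE)

# 'bg' fills the circles; 'fg' colours their outline
with(trees,
     symbols(Height, Volume, circles = Girth / 12, inches = FALSE,
             bg = pal[bin], fg = 'grey', xlab = 'Height', ylab = 'Volume'))
```

A colour legend could then be added with legend(..., fill = pal), in the same way the reply builds its example legends.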
Re: [R] as.numeric : what goes wrong?
Do an 'str' on the vector. Are you sure it is not a 'factor'? Try:

    as.numeric(as.character(j1[1]))

On 8/24/07, Wolfgang Polasek wrote:

I have a character vector j1 created from dimnames and want to convert it to numeric. Take the first element:

    > j1[1]
    f896 1
    896
    > as.numeric(j1[1])
    [1] 1990

Why is it not 896 as it should be? This is true for the whole vector.

Thanks
W.P.

-- Jim Holtman
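The reason for the suggestion can be seen on a small made-up factor (the values below are illustrative, not the poster's data): as.numeric() on a factor returns the internal level codes, not the labels that are printed.

```r
# a factor stores integer codes, with the visible values as level labels
f <- factor(c("896", "1990", "12"))

levels(f)                     # sorted as text: "12" "1990" "896"
as.numeric(f)                 # the internal codes: 3 2 1
as.numeric(as.character(f))   # the values themselves: 896 1990 12
```

Converting to character first recovers the labels, which can then be turned into numbers.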
Re: [R] Need a variant of rbind for datasets with different numbers of columns
Where is the data coming from, since it has a variable number of columns in each row? Is it coming from a text file? If so, you can use the fill=TRUE option when reading to fill out empty columns. You need to provide at least a subset of the data so we can see what you are working with.

On 8/22/07, Kirsten Beyer [EMAIL PROTECTED] wrote:

Hello. I am looking for a function that will allow me to paste rows together without regard for the numbers of columns in the datasets to be joined. The only columns where it matters if they are aligned correctly are at the beginning; the rest of the columns represent differing numbers of ICD9 (disease) codes reported by each person (record) at a health visit. They are in no particular order. For example, a result would look like this:

    patient    ICD91  ICD92  ICD93
    patient A  12345  67891543
    patient B  3469   9090
    patient C  1234

I am trying to accomplish this inside a loop which first identifies the codes associated with the person and then joins them to the person. I have the code working so that it can create a row for each person, but I can't figure out how to join these rows together! FYI, my dataset has 200,000+ people.

Thanks

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve?
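If the records do come from a text file, the fill=TRUE suggestion might look like the sketch below. The inline text, the patient labels and the code values are invented for illustration and stand in for the real file; `col.names` fixes the number of columns so that short rows are padded with NA.

```r
# Ragged input: each row has a different number of ICD9 codes (made-up data)
txt <- "A 12345 6789 1543
B 3469 9090
C 1234"

# fill = TRUE pads short rows; col.names sets the full column layout up front
dat <- read.table(text = txt, fill = TRUE,
                  col.names = c("patient", "ICD91", "ICD92", "ICD93"))
```

With a real file, `text = txt` would be replaced by the file name, and `col.names` would need as many entries as the longest row.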
Re: [R] rectify a program of seasonal dummies matrix
Your syntax is wrong; e.g.,

    if i==j

should be

    if (i == j)

The same goes for your use of 'if else'. Your example is also hard to follow without proper indentation.

On 8/21/07, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:

Hi friends, I would like to construct a matrix of seasonal dummies with number of rows (observations) = 100. Such a matrix is written as follows: [1 0 0 0;0 1 0 0;0 0 1 0;0 0 0 1;1 0 0 0;0 1 0 0;0 0 1 0;0 0 0 1;etc...]. I wrote the following program:

    T=100
    br=matrix(0,T,4)
    {
    for (i in 1:T)
    for (j in 1:4)
    if i==j
    br[i,j]=1
    if else (abs(i-j)%%4==0
    br[i,j]=1
    else br[i,j]=0
    }
    z<-br
    z

but unfortunately I obtained the following messages from the console (translated from French):

    {
    + for (i in 1:T)
    + for (j in 1:4)
    + (if i==j)
    Error: syntax error, unexpected SYMBOL, expecting '(' in:
    br[i,j]=1
    Error in br[i, j] = 1 : object i not found
    (if else (abs(i-j)%%4==0)
    Error: syntax error, unexpected ELSE, expecting '(' in (if else
    br[i,j]=1
    Error in br[i, j] = 1 : object i not found
    else
    Error: syntax error, unexpected ELSE in else
    br[i,j]=0
    Error in br[i, j] = 0 : object i not found
    }
    Error: syntax error, unexpected '}' in }

Can you please correct my small program? I tried to fix it but I can't. Many thanks in advance.

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve?
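For reference, a version of the loop with valid `if` syntax could look like the sketch below. It is one possible correction, not the poster's final code: `n_obs` replaces `T` (which would mask the shorthand for TRUE), the two conditions are merged into a single modulo test, and `br2` is an assumed vectorised alternative built from a 4x4 identity matrix.

```r
n_obs <- 100                 # number of observations (named T in the post)
br <- matrix(0, n_obs, 4)    # start with all zeros

for (i in 1:n_obs) {
  for (j in 1:4) {
    # row i belongs to season j when i - j is a multiple of 4
    if ((i - j) %% 4 == 0) {
      br[i, j] <- 1
    }
  }
}

# Vectorised alternative: repeat the rows of a 4x4 identity matrix
br2 <- diag(4)[rep(1:4, length.out = n_obs), ]
```

Both constructions give the pattern 1 0 0 0, 0 1 0 0, 0 0 1 0, 0 0 0 1 repeated down the rows.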
Re: [R] How to parse a string into the symbol for a data frame object
One way to do it is to pass in the character name of the dataframe you want to reference and then use 'get' to access the value, e.g.:

    df1 <- data.frame(x=seq(0,10), y=seq(10,20))
    df2 <- data.frame(a=seq(0,10), b=seq(10,20))
    # use the character names for referencing
    for (df in c('df1', 'df2')){
        # get the data to operate on (read-only)
        .val <- get(df)
        # now you can reference the object
        print(names(.val))
        # or construct new objects to store the value in
        # or you can use 'assign' to store back in the original object
        assign(paste('temp.', df, sep=''), .val)
    }

On 8/19/07, Darren Weber [EMAIL PROTECTED] wrote:

I have several data frames, e.g.:

    df1 <- data.frame(x=seq(0,10), y=seq(10,20))
    df2 <- data.frame(a=seq(0,10), b=seq(10,20))

It is common to create loops in R like this:

    for(df in list(df1, df2)){
        #etc.
    }

This works fine when you know the name of the objects to put into the list. I assume that the order of the objects in the list is respected through the loop. Inside the loop, the objects of the list are 'dereferenced' using 'df' but, to my knowledge, there is no way to tell whether 'df' is a current representation of 'df1' or 'df2' without some additional bookkeeping.

In addition, I really want to use 'paste' within the loop to create a new string value that will have the symbol name of a data frame to be dereferenced, e.g.:

    for(n in c(1, 2)){
        dfString <- paste('df', n, sep='')
        print(eval(dfString))
    }
    [1] "df1"
    [1] "df2"

This is not what I want. I have read through the documentation on eval and similar commands like substitute and quote. I program regularly, but I do not understand these constructs in R. I do not understand the R framework for parsing and evaluation and I don't have a lot of time right now to get lost in this detail. I could really use some help to get the string values in my loop to be parsed into symbols that refer to the data frame objects df1 and df2. How is this done?

Best, Darren

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve?
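An alternative to `get`, offered here only as a sketch and not as part of the original reply, is to keep the data frames in a named list, so that the name and the object travel together in the loop. The names `dfs`, `col_names` and `by_name` are illustrative.

```r
df1 <- data.frame(x = seq(0, 10), y = seq(10, 20))
df2 <- data.frame(a = seq(0, 10), b = seq(10, 20))

# A named list keeps each data frame together with its name
dfs <- list(df1 = df1, df2 = df2)

col_names <- list()
for (nm in names(dfs)) {
  # nm is the name ("df1", "df2"); dfs[[nm]] is the data frame itself
  col_names[[nm]] <- names(dfs[[nm]])
}

# Looking an object up from a pasted string still works through get()
by_name <- get(paste("df", 2, sep = ""))
```

Note that `eval()` on a character string simply returns the string, which is why the loop in the question printed the names rather than the data frames.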
Re: [R] matching elements from two vectors
    x <- c(1,2,1,1,3,5,3,3,1)
    y <- c(2,3)
    intersect(x,y)
    [1] 2 3

On 8/17/07, Gonçalo Ferraz [EMAIL PROTECTED] wrote:

Hi, imagine a vector x with elements (1,2,1,1,3,5,3,3,1) and a vector y with elements (2,3). I need to find out what elements of x match any of the elements of y. Is there a simple command that will return a vector with elements (F,T,F,F,T,F,T,T,F)? Ideally, I would like a solution that works with dataframe columns as well. I have tried x==y and it doesn't work. x==any(y) doesn't work either. I realize I could write a for loop and go through each element of y asking if it matches any element of x, but isn't there a shorter way?

Thanks,
Gonçalo

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve?
Re: [R] matching elements from two vectors
Also, if you want all the matches:

    x[x %in% y]
    [1] 2 3 3 3

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve?
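The logical vector asked for in the question is what `%in%` returns before it is used as an index. A minimal sketch with the vectors from the thread (`is_match` and `positions` are illustrative names):

```r
x <- c(1, 2, 1, 1, 3, 5, 3, 3, 1)
y <- c(2, 3)

# TRUE wherever an element of x equals any element of y
is_match <- x %in% y

# Positions of the matching elements, if indices are more convenient
positions <- which(is_match)
```

The same expression works on data frame columns, for example `df$col %in% y`, where `df` and `col` are hypothetical names.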
Re: [R] comparison of arrays of strings
Read them into 2 different vectors and then use 'intersect'.

On 8/17/07, ramakanth reddy [EMAIL PROTECTED] wrote:

Hi, I have two arrays of gene names, one with 18 gene names and the other with 24000 gene names. I have to compare them to find the common names. I have both arrays in .csv format. I loaded the files and tried to compare them using for and if loops, but I got the error:

    Error in Ops.factor(cgh[i, 1], cgh[j, 2]) : level sets of factors are different

Please suggest how to solve this problem, or any alternative procedure.

Thanks,
ramakanth

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve?
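A sketch of that suggestion follows. The inline CSV text, the column name `gene` and the gene symbols are placeholders for the poster's two files; reading with `stringsAsFactors = FALSE` keeps the names as character strings, which also avoids the `Ops.factor` error quoted in the question.

```r
# Two small stand-ins for the .csv files (illustrative gene names)
a <- read.csv(text = "gene\nTP53\nBRCA1\nEGFR", stringsAsFactors = FALSE)
b <- read.csv(text = "gene\nEGFR\nMYC\nTP53", stringsAsFactors = FALSE)

# Names present in both lists
common <- intersect(a$gene, b$gene)
```

With real files, `text = ...` would be replaced by the file names.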
Re: [R] for plots
Turn 'Recording' on for the plots:

    windows(record=TRUE)

or select it from the GUI.

On 8/17/07, Brad Zhang [EMAIL PROTECTED] wrote:

Hi all, I am a beginner with R. I have installed R 2.5.1 in a Windows environment. After I run a program such as gam, I would like to display a plot for the object. The following is an example. When I did this, only the last plot was presented on my screen. How can I get the plots before the last one? I mean, if the object has several plots, how can I get those?

    gam.object <- gam(y ~ s(x,6) + z, data=gam.data)
    plot(gam.object, se=TRUE)

Thank you.
Brad

Dr. Guicheng (Brad) Zhang, Senior Research Officer, School of Paediatrics and Child Health, Telethon Institute for Child Health Research, 100 Roberts Road, Subiaco, Western Australia, 6008 AUSTRALIA. Email: [EMAIL PROTECTED] Phone: 93407896 Fax: 93882097

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve?
Re: [R] How to write to a table column by column?
Assuming that the daily.incomes are the same lengths, then your loop could be:

    Lst <- list()
    for (i in 1:count) Lst[[i]] <- list(..)
    Lst.col <- do.call('cbind', Lst)

On 8/12/07, Yuchen Luo [EMAIL PROTECTED] wrote:

Dear friends. Every loop of my program will result in a list that is very long, with a structure similar to the one below:

    Lst <- list(name="Fred", wife="Mary", daily.incomes=c(1:850))

Please notice the large size of daily.incomes. I need to store all such lists in a csv file so that I can easily view them in Excel. Excel cannot display a row of more than 300 elements; therefore, I have to store the lists as columns. It is not hard to store one list as a column in the csv file. The problem is how to store the second list as a second column, so that the two columns lie side by side and I can easily compare their elements. (If I use 'append=TRUE', the second time series will be stored in the same column.)

Thank you for your time; your help will be highly appreciated!

Best,
Yuchen Luo

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve?
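The collect-then-cbind idea can be sketched end to end as below. The loop body is a placeholder: `count`, the fake `daily.incomes` values, the column names `run1`, `run2`, ... and the temporary output file are all assumptions made for illustration.

```r
count <- 3                       # number of loop iterations (assumed)
cols <- vector("list", count)    # one slot per future column

for (i in seq_len(count)) {
  # placeholder for the real computation producing 850 values
  daily.incomes <- seq_len(850) * i
  cols[[i]] <- daily.incomes
}

# Name the columns, then bind them side by side into one matrix
names(cols) <- paste("run", seq_len(count), sep = "")
out <- do.call(cbind, cols)

# Write all columns at once instead of appending inside the loop
out_file <- tempfile(fileext = ".csv")
write.csv(out, out_file, row.names = FALSE)
```

Writing once at the end sidesteps `append=TRUE`, which can only add rows below existing data, never columns beside it.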
Re: [R] Extract part of vector
This should do it:

    txt
    [1] "\nhttp://www.mysite.com/system/empty.asp?P=2&VID=default&SID=421384237289476&S=1&C=18631"
    [2] "\nhttp://www.mysite.com/system/empty.asp?P=123&VID=default&SID=421384237289476&S=1&C=18643"
    [3] "\nhttp://www.mysite.com/system/empty.asp?P=342&VID=default&SID=421384237289476&S=1&C=18634\n"
    [4] "\nhttp://www.mysite.com/system/empty.asp?P=232&VID=default&SID=421384237289476&S=1&C=18645"
    [5] "\nhttp://www.mysite.com/system/empty.asp?P=2345&VID=default&SID=421384237289476&S=1&C=18254"
    [6] "\nhttp://www.mysite.com/system/empty.asp?P=257654&VID=default&SID=421384237289476&S=1&C=18732"
    [7] "\nhttp://www.mysite.com/system/empty.asp?P=22&VID=default&SID=421384237289476&S=1&C=18637"
    [8] "\nhttp://www.mysite.com/system/empty.asp?P=2463&VID=default&SID=421384237289476&S=1&C=18575\n"
    gsub("^.*asp.P=([[:digit:]]+).*$", '\\1', txt)
    [1] "2"      "123"    "342"    "232"    "2345"   "257654" "22"     "2463"

On 8/13/07, Lauri Nikkinen [EMAIL PROTECTED] wrote:

Dear R-users, how do I extract the numbers between asp?P= and VID from my txt vector? I have tried the grep function with no luck.

    txt <- c("
    http://www.mysite.com/system/empty.asp?P=2&VID=default&SID=421384237289476&S=1&C=18631", "
    http://www.mysite.com/system/empty.asp?P=123&VID=default&SID=421384237289476&S=1&C=18643", "
    http://www.mysite.com/system/empty.asp?P=342&VID=default&SID=421384237289476&S=1&C=18634
    ", "
    http://www.mysite.com/system/empty.asp?P=232&VID=default&SID=421384237289476&S=1&C=18645", "
    http://www.mysite.com/system/empty.asp?P=2345&VID=default&SID=421384237289476&S=1&C=18254", "
    http://www.mysite.com/system/empty.asp?P=257654&VID=default&SID=421384237289476&S=1&C=18732", "
    http://www.mysite.com/system/empty.asp?P=22&VID=default&SID=421384237289476&S=1&C=18637", "
    http://www.mysite.com/system/empty.asp?P=2463&VID=default&SID=421384237289476&S=1&C=18575
    ")

The result should be like 2 123 342 232 2345 257654 22 2463.

Thanks,
Lauri

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve?
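The same capture-group technique, reduced to a small runnable sketch. The three URLs are shortened stand-ins for the poster's data, and the pattern escapes the `?` explicitly where the reply used `.`; converting with `as.numeric()` is an optional extra step.

```r
# Shortened example URLs (illustrative, modelled on the thread)
txt <- c("http://www.mysite.com/system/empty.asp?P=2&VID=default",
         "http://www.mysite.com/system/empty.asp?P=123&VID=default",
         "http://www.mysite.com/system/empty.asp?P=257654&VID=default")

# Keep only the digits captured right after "asp?P="
ids_chr <- sub("^.*asp\\?P=([[:digit:]]+).*$", "\\1", txt)

# Convert to numbers if numeric values are wanted
ids <- as.numeric(ids_chr)
```

`grep` only reports which elements match; `sub` or `gsub` with a backreference is what returns the matched part.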
Re: [R] invert 160000x160000 matrix
You would need about 200GB to store a single copy of the matrix, so if you have about 1TB of physical memory on your computer, it might be possible.

On 8/13/07, Jiao Yang [EMAIL PROTECTED] wrote:

Can R invert a 160000x160000 matrix with all positive numbers? Thanks a lot!

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve?
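The 200GB figure follows from the size of a double-precision value (8 bytes). A quick back-of-the-envelope check, assuming a dense numeric matrix with no sparse or packed storage:

```r
n <- 160000                 # rows and columns
bytes_per_double <- 8       # R stores numeric values as 8-byte doubles

total_bytes <- n^2 * bytes_per_double
total_gb <- total_bytes / 1e9   # decimal gigabytes
```

This gives roughly 205 GB for one copy; an inversion needs working copies on top of that, which is why the reply mentions about 1TB.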
Re: [R] Write values on y axe
Does this do what you want:

    x <- runif(10)
    plot(x)
    # put min/max in red
    axis(2, at=round(range(x), 4), col.axis='red', las=2)

On 8/12/07, akki [EMAIL PROTECTED] wrote:

Hi, I have values on the y axis from 0.0001 to 3.086. When I plot, the values written on the axis are 0.001, 0.050, 1.000, ..., but how can I write the minimum and maximum values on the graph, with all decimals (I don't want to use the format 1e-0x)? I am using a log scale. For example, if I have the values:

    0.0001 0.0015 0.0256 0.0236 0.0201 2.9668 3.0086

I need each 'x' value put on the y axis, and the minimum and maximum values added to my graph. How can I do it? I do:

    plot(o$a, log="y", type="l", col=colors[1], xlab="a_x", ylab="a_y", cex.lab=0.8)
    lines(o$b, type="l", pch=1, lty=1, col=colors[2])
    lines(o$c, type="l", pch=2, lty=2, col=colors[3])

to draw my graph. Thanks in advance.

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve?
Re: [R] Legend on graph
If you are asking to have the values plotted on top of the legend, then you can do the following:

    plot(x, y, type='n', ...)  # create plot, but don't plot
    legend('topright', ...)
    lines(x, y)  # now plot the data

If you want it outside the plot, check the archives for several examples.

On 8/12/07, akki [EMAIL PROTECTED] wrote:

Hi, I have a problem when I want to put a legend on the graph. I do:

    legend("topright", names(o), cex=0.9, col=plot_colors, lty=1:5, bty="n")

but the legend is drawn inside the graph (at the top, but inside the plot area), where I have values. How can I put the legend at the top of the graph without it covering the plotted values?

Thanks.

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve?
Re: [R] How to control the number format on plot axes ?
Here is a way to apply the formatting that you want; you were not clear on exactly what you were after. You can set up the 'labels' argument for whatever you want.

    a <- 1:10
    myTicks <- c(0.1,1,2,5,10)
    # set ylim to range of myTicks that you want
    plot(x=a, y=a, log="y", type="p", yaxt="n", ylim=range(myTicks))
    # change the sprintf to whatever formatting you want
    axis(side=2, at=myTicks, labels=ifelse(myTicks >= 1, sprintf("%.0f", myTicks), sprintf("%0.1f", myTicks)))

On 8/12/07, Sébastien [EMAIL PROTECTED] wrote:

Dear R-users, basically, everything is in the title of my e-mail. I know that some threads from the archives have already addressed this question but they did not really give a clear solution. Here is a series of short codes that will illustrate the problem:

    # First
    a <- 1:10
    plot(x=a, y=a, log="y", type="p")

    # Second
    a <- 1:10
    myTicks <- c(1,2,5,10)
    plot(x=a, y=a, log="y", type="p", yaxt="n")
    axis(side=2, at=myTicks)

    # Third
    a <- 1:10
    myTicks <- c(0.1,1,2,5,10)
    plot(x=a, y=a, log="y", type="p", yaxt="n")
    axis(side=2, at=myTicks)

    # Fourth
    a <- 0.1:10
    plot(x=a, y=a, log="y", type="p")

In the first and second examples, the plots are identical and the tick labels are 1, 2, 5 and 10. In the third, the labels are numbers in the x.0 format (1.0, 2.0, 5.0 and 10.0), even if there is no point below 1. The only reason I see is that the first element of myTicks is 0.1. And the fourth example is self-explanatory.

Interestingly, the 'scales' argument of xyplot in the lattice package does not add these (unnecessary) decimals on labels greater than 1. Do you know how I could transpose the behavior of the lattice 'scales' argument to the 'axis' function?

Thank you

PS: No offense, but please don't suggest I use lattice. I have to go for base R graphics in my full-scale project (it is a speed issue).

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve?
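The label logic from the reply can be checked on its own, without drawing anything. In this sketch `labs` is an illustrative name; the tick positions are the ones used in the thread.

```r
myTicks <- c(0.1, 1, 2, 5, 10)

# No decimals for ticks at or above 1, one decimal for ticks below 1
labs <- ifelse(myTicks >= 1,
               sprintf("%.0f", myTicks),
               sprintf("%0.1f", myTicks))
```

The resulting character vector is what gets passed as `labels=` in `axis(side=2, at=myTicks, labels=labs)`.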
Re: [R] Replace NAs in dataframe: what am I doing wrong
The problem is that the first column is probably a factor and you are trying to assign a value that is not already a 'level' in the factor. One way is to read the data with as.is=TRUE to keep it as character, replace the NAs and then convert back to factors if you want to:

    x <- read.csv(textConnection("A,B
    a,3
    b,4
    .,.
    c,5"), na.strings='.', as.is=TRUE)  # keep as character
    # replace NAs
    x[is.na(x[,1]), 1] <- "Missing Value"
    # convert back to factors if you want to
    x[[1]] <- factor(x[[1]])
    str(x)
    'data.frame': 4 obs. of 2 variables:
     $ A: Factor w/ 4 levels "a","b","c","Missing Value": 1 2 4 3
     $ B: int 3 4 NA 5

On 8/11/07, Sébastien [EMAIL PROTECTED] wrote:

Dear R-users, my script imports a dataset from a csv file, in which missing values are represented by ".". The import is done into a dataframe using the read.table function with na.strings = ".". Then I want to replace the NAs in the first column of the dataframe by "Missing data". I am using the following code to do so:

    mydata <- data.frame(read.table(myFile, sep=",", header=TRUE, na.strings="."))  # myFile is the full path of the source file
    mydata[,1][is.na(mydata[,1])] <- "Missing value"

This code works perfectly fine if this first column contains only missing values, i.e. ".". As soon as it contains multiple levels and missing values, things start to go wrong. I get the following warning and the replacement is not done:

    Warning message:
    invalid factor level, NAs generated in: `[<-.factor`(`*tmp*`, is.na(mydata[, 1]), value = "Missing value")

Is there an error in my code or is that a bug (I doubt it)? Thanks in advance.

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve?
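If the column has to stay a factor throughout, another option is to add the new level before assigning. This is a sketch of that alternative, not the method given in the reply; it reuses the small example data and reads it with `stringsAsFactors = TRUE` so that column A is a factor.

```r
# Same toy data as in the reply; "." marks a missing value
x <- read.csv(text = "A,B\na,3\nb,4\n.,.\nc,5",
              na.strings = ".", stringsAsFactors = TRUE)

# Register the new label as a valid level first ...
levels(x$A) <- c(levels(x$A), "Missing Value")

# ... then the assignment no longer produces NAs
x$A[is.na(x$A)] <- "Missing Value"
```

Without the `levels<-` step, the assignment triggers the "invalid factor level" warning quoted in the question.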
Re: [R] shell and shell.exec on Windows
If you are using Windows, then try:

    system('cmd /c yourfile.xls')

This will invoke the Windows command processor and it should pick the correct association.

On 8/11/07, Erich Neuwirth [EMAIL PROTECTED] wrote:

Thanks Gabor, system() indeed would be the answer, but it does not solve my problem because of some inconsistencies in Windows XP. I will explain the story, because perhaps it can help somebody else avoid wasting time.

On my machine, when I double-click an .xlsm file, it is opened in Excel 2007. .xls files are opened in Excel 2003. shell.exec("file.xls") and shell.exec("file.xlsm") also open the files in Excel 2003 and Excel 2007 respectively. system() does not invoke a shell, so I need to find the application associated with Excel to create a string with the name of the application and the name of the file to open. Then, something like

    system("\"c:\\mypath\\CorrectVersionOfExcel.exe\" \"c:\\mydir\\myexcelfile.xlsm\"")

should work (and run the program invisibly). There are two helpful shell commands in WinXP, ASSOC and FTYPE:

    ASSOC .xls
    .xls=Excel.Sheet.8
    ASSOC .xlsm
    .xlsm=Excel.SheetMacroEnabled.12
    ftype Excel.Sheet.8
    Excel.Sheet.8=C:\Program Files\Microsoft Office\OFFICE11\EXCEL.EXE /e
    ftype Excel.SheetMacroEnabled.12
    Excel.SheetMacroEnabled.12=C:\PROGRA~2\MICROS~2\OFFICE11\EXCEL.EXE /e

So despite the fact that double-clicking .xlsm files or using shell.exec opens Excel 2007, the application reported by assoc and ftype for .xlsm files is Excel 2003.

Gabor Grothendieck wrote:

The system() function has an invisible= argument. The ryacas package uses system() to run yacas. See the runYacas() and yacasInvokeString() functions in yacas.R for examples: http://ryacas.googlecode.com/svn/trunk/R/yacas.R

On 8/11/07, Erich Neuwirth [EMAIL PROTECTED] wrote:

I have an Excel workbook MyWorkbook.xls containing an Auto_Open macro which I want to be run from R. shell.exec("MyWorkbook.xls") does that. shell("start MyWorkbook.xls") also runs it. In both cases, the Excel window is visible on screen when Excel is started. Is there a way of opening the sheet with a hidden Excel window? start has some parameters (e.g. /MIN) which should allow this, but shell("start /MIN MyWorkbook.xls") also starts Excel visibly.

-- Erich Neuwirth, University of Vienna Faculty of Computer Science Computer Supported Didactics Working Group Visit our SunSITE at http://sunsite.univie.ac.at Phone: +43-1-4277-39464 Fax: +43-1-4277-39459

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve?
Re: [R] Help wit matrices
Is this what you want:

    x <- matrix(runif(100), 10)
    round(x, 3)
           [,1]  [,2]  [,3]  [,4]  [,5]  [,6]  [,7]  [,8]  [,9] [,10]
     [1,] 0.268 0.961 0.262 0.347 0.306 0.762 0.524 0.062 0.028 0.226
     [2,] 0.219 0.100 0.165 0.131 0.578 0.933 0.317 0.109 0.527 0.131
     [3,] 0.517 0.763 0.322 0.374 0.910 0.471 0.278 0.382 0.880 0.982
     [4,] 0.269 0.948 0.510 0.631 0.143 0.604 0.788 0.169 0.373 0.327
     [5,] 0.181 0.819 0.924 0.390 0.415 0.485 0.702 0.299 0.048 0.507
     [6,] 0.519 0.308 0.511 0.690 0.211 0.109 0.165 0.192 0.139 0.681
     [7,] 0.563 0.650 0.258 0.689 0.429 0.248 0.064 0.257 0.321 0.099
     [8,] 0.129 0.953 0.046 0.555 0.133 0.499 0.755 0.181 0.155 0.119
     [9,] 0.256 0.954 0.418 0.430 0.460 0.373 0.620 0.477 0.132 0.050
    [10,] 0.718 0.340 0.854 0.453 0.943 0.935 0.170 0.771 0.221 0.929
    ifelse(x > .5, 1, 0)
          [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
     [1,]    0    1    0    0    0    1    1    0    0     0
     [2,]    0    0    0    0    1    1    0    0    1     0
     [3,]    1    1    0    0    1    0    0    0    1     1
     [4,]    0    1    1    1    0    1    1    0    0     0
     [5,]    0    1    1    0    0    0    1    0    0     1
     [6,]    1    0    1    1    0    0    0    0    0     1
     [7,]    1    1    0    1    0    0    0    0    0     0
     [8,]    0    1    0    1    0    0    1    0    0     0
     [9,]    0    1    0    0    0    0    1    0    0     0
    [10,]    1    0    1    0    1    1    0    1    0     1

On 8/10/07, Lanre Okusanya [EMAIL PROTECTED] wrote:

Hello all, I am working with a 1000x1000 matrix, and I would like to return a 1000x1000 matrix that tells me which values in the matrix are greater than a threshold value (1 or 0 indicator). I have tried

    mat2 <- as.matrix(as.numeric(mat1 > 0.25))

but that returns a 1:10 matrix. I have also tried for loops, but they are grossly inefficient. Thanks for all your help in advance.

Lanre

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve?
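A compact variant of the same idea, shown as a sketch: comparing a matrix with a threshold yields a logical matrix of the same shape, and multiplying by 1 turns it into 0/1 while keeping the dimensions. The seed and the names `ind` and `flat` are illustrative.

```r
set.seed(1)                       # fixed seed so the example is repeatable
x <- matrix(runif(100), 10)

# Logical matrix times 1 gives a numeric 0/1 matrix of the same shape
ind <- (x > 0.5) * 1

# as.numeric() alone would drop the dim attribute and return a plain vector
flat <- as.numeric(x > 0.5)
```

This explains the attempt in the question: `as.numeric()` flattens the matrix, so `as.matrix()` can only rebuild a single column from it.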
Re: [R] Countvariable for id by date
This should do what you want:

    x <- read.table(textConnection("id;dg1;dg2;date;
    1;F28;;1997-11-04;
    1;F20;F702;1998-11-09;
    1;F20;;1997-12-03;
    1;F208;;2001-03-18;
    2;F32;;1999-03-07;
    2;F29;F32;2000-01-06;
    2;F32;;2003-07-05;
    2;F323;F2800;2000-02-05;"), header=TRUE, sep=";", as.is=TRUE)
    # convert dates
    x$dateP <- unclass(as.POSIXct(x$date))
    # matches for F20
    F20 <- grep("F20", paste(x$dg1, x$dg2))
    # matches for F21 - F29
    F21 <- grep("F2[1-9]", paste(x$dg1, x$dg2))
    # grouping
    x$F20 <- x$F21 <- NA
    x$F20[F20] <- rank(x$dateP[F20])
    x$F21[F21] <- rank(x$dateP[F21])
    x
      id  dg1   dg2       date  X      dateP F21 F20
    1  1  F28       1997-11-04 NA  878601600   1  NA
    2  1  F20  F702 1998-11-09 NA  910569600  NA   2
    3  1  F20       1997-12-03 NA  881107200  NA   1
    4  1 F208       2001-03-18 NA  984873600  NA   3
    5  2  F32       1999-03-07 NA  920764800  NA  NA
    6  2  F29   F32 2000-01-06 NA  947116800   2  NA
    7  2  F32       2003-07-05 NA 1057363200  NA  NA
    8  2 F323 F2800 2000-02-05 NA  949708800   3  NA

On 8/9/07, David Gyllenberg [EMAIL PROTECTED] wrote:

Best R-users, here's a newbie question. I have tried to find an answer to this via help and the ave(x,factor(),FUN=function(y) rank (z,tie='first')-function, but without success.

I have a dataframe (~8000 observations, register data) with four columns of interest: id, dg1, dg2 and date (YYYY-MM-DD):

    id;dg1;dg2;date;
    1;F28;;1997-11-04;
    1;F20;F702;1998-11-09;
    1;F20;;1997-12-03;
    1;F208;;2001-03-18;
    2;F32;;1999-03-07;
    2;F29;F32;2000-01-06;
    2;F32;;2003-07-05;
    2;F323;F2800;2000-02-05;
    ...

I would like to have two additional columns:

1. countF20: a count variable that shows the order (by date) of the observation within the id if it fulfils the following logical expression: dg1 = F20* OR dg2 = F20*, where * means F201, F202, ..., F2001, F2002, ..., F20001, F20002, ...

2. countF2129: another count variable that shows the order (by date) of the observation within the id if it fulfils the following logical expression: dg1 = F21*-F29* OR dg2 = F21*-F29*, where F21*-F29* means F21*, F22*, ..., F29* and where * means F211, F212, ..., F2101, F2102, ..., F21001, F21002, ...

so the dataframe would look like this, where 1 is the first observation for the id with the right condition, 2 is the second, etc.:

    id;dg1;dg2;date;countF20;countF2129;
    1;F28;;1997-11-04;;1;
    1;F20;F702;1998-11-09;2;;
    1;F20;;1997-12-03;1;;
    1;F208;;2001-03-18;3;;
    2;F32;;1999-03-07;;;
    2;F29;F32;2000-01-06;;1;
    2;F32;;2003-07-05;;;
    2;F323;F2800;2000-02-05;;2;
    ...

Do you know a convenient way to create these kinds of count variables? Thank you in advance!

/ David (david.gyllenberg at yahoo.com)

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve?
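The output in the reply ranks dates across all ids, whereas the question asks for the order within each id. A sketch of a per-id variant using `ave()` follows; it is not from the thread. The helper `count_by_id` and the anchored patterns `^F20` and `^F2[1-9]` are assumptions about what "F20*" and "F21*-F29*" are meant to match.

```r
x <- read.table(text = "id;dg1;dg2;date
1;F28;;1997-11-04
1;F20;F702;1998-11-09
1;F20;;1997-12-03
1;F208;;2001-03-18
2;F32;;1999-03-07
2;F29;F32;2000-01-06
2;F32;;2003-07-05
2;F323;F2800;2000-02-05", header = TRUE, sep = ";", as.is = TRUE)

d <- as.numeric(as.Date(x$date))   # dates as day counts, for ranking

# Rank the dates of the flagged rows separately within each id
count_by_id <- function(hit) {
  out <- rep(NA_real_, nrow(x))
  out[hit] <- ave(d[hit], x$id[hit],
                  FUN = function(v) rank(v, ties.method = "first"))
  out
}

# A row qualifies if either diagnosis column starts with the code
is_F20   <- grepl("^F20", x$dg1)     | grepl("^F20", x$dg2)
is_F2129 <- grepl("^F2[1-9]", x$dg1) | grepl("^F2[1-9]", x$dg2)

x$countF20   <- count_by_id(is_F20)
x$countF2129 <- count_by_id(is_F2129)
```

On the eight example rows this reproduces the countF20 and countF2129 columns shown in the question.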
Re: [R] plot table with sapply - labeling problems
Here is a modified script that should work. In many cases where you want the names of the elements of the list you are processing, you should iterate over the names:

test <- as.data.frame(cbind(round(runif(50,0,5)), round(runif(50,0,3)), round(runif(50,0,4))))
sapply(test, table) -> vardist
sapply(test, function(x) round(table(x)/sum(table(x))*100, 1)) -> vardist1
par(mfrow=c(1,3))
# you need to use the 'names' and then index into the variable;
# your original 'x' did not have a name associated with it
sapply(names(vardist1), function(x) barplot(vardist1[[x]], ylim=c(0,100), main="Varset1", xlab=x))
par(mfrow=c(1,1))

On 8/9/07, [EMAIL PROTECTED] wrote:
> Hi List,
>
> I am trying, unsuccessfully, to label a group of barplots with variable
> names when using sapply. I can't seem to extract the names for the
> individual plots:
>
> test <- as.data.frame(cbind(round(runif(50,0,5)), round(runif(50,0,3)), round(runif(50,0,4))))
> sapply(test, table) -> vardist
> sapply(test, function(x) round(table(x)/sum(table(x))*100, 1)) -> vardist1
> par(mfrow=c(1,3))
> sapply(vardist1, function(x) barplot(x, ylim=c(0,100), main="Varset1", xlab=names(x)))
> par(mfrow=c(1,1))
>
> Names don't show up although names(vardist) works. Also I would like to put
> a single title on this plot instead of repeating "Varset1" three times.
>
> Any hints appreciated.
>
> Thanx
> Herry

--
Jim Holtman
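The second part of the question (one title for the whole page rather than "Varset1" three times) is not covered in the reply. One common way, sketched here under my own assumptions, is to reserve an outer margin with par(oma=) and write the title once with mtext(outer = TRUE). The column names V1 to V3 and the seed are illustrative only.

```r
set.seed(1)
test <- data.frame(V1 = round(runif(50, 0, 5)),
                   V2 = round(runif(50, 0, 3)),
                   V3 = round(runif(50, 0, 4)))

# Percentage table per column; lapply always returns a list,
# so indexing by name is safe.
vardist1 <- lapply(test, function(x) round(100 * prop.table(table(x)), 1))

op <- par(mfrow = c(1, 3), oma = c(0, 0, 2, 0))  # room for an outer title
for (nm in names(vardist1)) {
  barplot(vardist1[[nm]], ylim = c(0, 100), xlab = nm)
}
mtext("Varset1", outer = TRUE, cex = 1.2)         # single title for the page
par(op)
```

The for loop over names plays the same role as sapply(names(vardist1), ...) in the reply; it just avoids printing the return values of barplot().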
Re: [R] Subsetting by number of observations in a factor
Does this do what you want? It creates a new dataframe with those 'mg' that have at least a certain number of observations.

set.seed(2)
# create some test data
x <- data.frame(mg=sample(LETTERS[1:4], 20, TRUE), data=1:20)
# split the data into subsets based on 'mg'
x.split <- split(x, x$mg)
str(x.split)
List of 4
 $ A:'data.frame': 7 obs. of 2 variables:
  ..$ mg  : Factor w/ 4 levels "A","B","C","D": 1 1 1 1 1 1 1
  ..$ data: int [1:7] 1 4 7 12 14 18 20
 $ B:'data.frame': 3 obs. of 2 variables:
  ..$ mg  : Factor w/ 4 levels "A","B","C","D": 2 2 2
  ..$ data: int [1:3] 9 15 19
 $ C:'data.frame': 4 obs. of 2 variables:
  ..$ mg  : Factor w/ 4 levels "A","B","C","D": 3 3 3 3
  ..$ data: int [1:4] 2 3 10 11
 $ D:'data.frame': 6 obs. of 2 variables:
  ..$ mg  : Factor w/ 4 levels "A","B","C","D": 4 4 4 4 4 4
  ..$ data: int [1:6] 5 6 8 13 16 17
# only choose subsets with at least 5 observations
x.5 <- lapply(x.split, function(a) {
    if (nrow(a) >= 5) return(a)
    else return(NULL)
})
# create new dataframe with these observations
x.new <- do.call('rbind', x.5)
x.new
     mg data
A.1   A    1
A.4   A    4
A.7   A    7
A.12  A   12
A.14  A   14
A.18  A   18
A.20  A   20
D.5   D    5
D.6   D    6
D.8   D    8
D.13  D   13
D.16  D   16
D.17  D   17

On 8/9/07, Ron Crump wrote:
> Hi,
>
> I generally do my data preparation externally to R, so this is a bit
> unfamiliar to me, but a colleague has asked me how to do certain data
> manipulations within R.
>
> Anyway, basically I can get his large file into a dataframe. One of the
> columns is a management group code (mg). There may be varying numbers of
> observations per management group, and he would like to subset the
> dataframe such that there are always at least n per management group.
>
> I presume I can get to this using table or tapply, then (and I'm not sure
> how on this bit) creating a column nmg containing the number of
> observations that corresponds to mg for that row, then simply subsetting.
>
> So, am I on the right track? If so, how do I actually do it, and is there
> an easier method than the one I am considering?
>
> Thanks for your help,
> Ron

--
Jim Holtman
Re: [R] Subsetting by number of observations in a factor
Here is an even faster way:

# faster way
x.mg.size <- table(x$mg)  # count occurrences
x.mg.5 <- names(x.mg.size)[x.mg.size > 5]  # select greater than 5
x.new1 <- subset(x, x$mg %in% x.mg.5)  # use in the subset
x.new1
   mg data
1   A    1
4   A    4
5   D    5
6   D    6
7   A    7
8   D    8
12  A   12
13  D   13
14  A   14
16  D   16
17  D   17
18  A   18
20  A   20

On 8/9/07, Ron Crump wrote:
> Jim,
>
>> Does this do what you want? It creates a new dataframe with those 'mg'
>> that have at least a certain number of observations.
>
> Looks good. I also have an alternative solution which appears to work, so
> I'll see which is quicker on the big data set in question.
>
> My solution:
>
> mgsize <- as.data.frame(table(in$mg))
> in2 <- merge(in, mgsize, by.x="mg", by.y="Var1")
> out <- subset(in2, Freq > 1, select = -Freq)
>
> Thanks for your help.
> Ron.

--
Jim Holtman
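Ron's first message describes adding a per-row count column and then subsetting. A minimal sketch of that idea with ave() is given below, assuming the same test data as in the replies. The names n, grp_size and x.new2 are mine. Note that this sketch uses >= n ("at least n"), while the table-based code above uses > 5; on this test data both keep the same groups.

```r
set.seed(2)
x <- data.frame(mg = sample(LETTERS[1:4], 20, TRUE), data = 1:20)

n <- 5
# For every row, the number of rows sharing its 'mg' value.
grp_size <- ave(seq_along(x$mg), x$mg, FUN = length)
# Keep only rows whose group has at least n observations;
# original row order and row names are preserved.
x.new2 <- x[grp_size >= n, ]
x.new2
```

Because no split/rbind or merge is involved, the rows come back in their original order, which may or may not matter for the real data set.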
Re: [R] input data file
If you are going to read it back into R, then use 'save'; if it is input to another application, consider 'write.csv'. I assume that when you say "save all my data files" you really mean "save all my R objects".

On 8/7/07, Tiandao Li wrote:
> Hello,
>
> I am new to R. I used scan() to read data from tab-delimited files. I want
> to save all my data files (multiple scan()) in another file, and use it
> like the infile statement in SAS or \input{tex.file} in latex.
>
> Thanks!

--
Jim Holtman
Re: [R] how to convert decimal date to its equivalent date format(YYYY.mm.dd.hr.min.sec)
Is this what you want?

x <- scan(textConnection("1979.00

1979.020833

1979.041667

1979.062500"), what=0)
Read 4 items
# get the year and then determine the number of seconds in the year so you can
# use the decimal part of the year
x.year <- floor(x)
# fraction of the year
x.frac <- x - x.year
# number of seconds in each year
x.sec.yr <- unclass(ISOdate(x.year+1,1,1,0,0,0)) - unclass(ISOdate(x.year,1,1,0,0,0))
# now get the actual time
x.actual <- ISOdate(x.year,1,1,0,0,0) + x.frac * x.sec.yr
x.actual
[1] "1979-01-01 00:00:00 GMT" "1979-01-08 14:29:49 GMT" "1979-01-16 05:00:10 GMT"
[4] "1979-01-23 19:30:00 GMT"

On 8/7/07, Yogesh Tiwari wrote:
> Hello R Users,
>
> How do I convert a decimal date to a date as YYYY.mm.dd.hr.min.sec?
>
> For example, I have decimal dates in one column, and want to convert each
> and write the equivalent date (YYYY.mm.dd.hr.min.sec) in the next six
> columns.
>
> 1979.00
> 1979.020833
> 1979.041667
> 1979.062500
>
> Is it possible in R?
>
> Kindly help,
>
> Regards,
> Yogesh
>
> --
> Dr. Yogesh K. Tiwari, Scientist,
> Indian Institute of Tropical Meteorology,
> Homi Bhabha Road, Pashan, Pune-411008 INDIA
> Phone: 0091-99 2273 9513 (Cell) : 0091-20-258 93 600 (O) (Ext.250)
> Fax: 0091-20-258 93 825

--
Jim Holtman
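The question also asks for the result in six separate columns. A sketch of that last step, building on the same year-fraction idea as the reply, could use as.POSIXlt to pull the components apart. The data frame name parts and its column names are my own choice, and rounding to whole seconds is an assumption about the precision wanted.

```r
x <- c(1979.00, 1979.020833, 1979.041667, 1979.062500)

yr    <- floor(x)
start <- ISOdatetime(yr,     1, 1, 0, 0, 0, tz = "GMT")
end   <- ISOdatetime(yr + 1, 1, 1, 0, 0, 0, tz = "GMT")
# Length of each year in seconds (handles leap years).
secs_in_year <- as.numeric(difftime(end, start, units = "secs"))

when <- start + round((x - yr) * secs_in_year)   # rounded to whole seconds
lt   <- as.POSIXlt(when, tz = "GMT")

# POSIXlt stores years since 1900 and months 0-11, hence the offsets.
parts <- data.frame(year = lt$year + 1900, month = lt$mon + 1, day = lt$mday,
                    hour = lt$hour, min = lt$min, sec = lt$sec)
parts
```

The six columns can then be bound to the original data with cbind() or written out with write.table().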
Re: [R] input data file
I would hope that you don't have 100 'scan' statements; you should just have a loop that uses a vector of file names to read the data. Are you reading the data into separate objects? If so, have you considered reading the 100 files into a 'list' so that you have a single object with all of your data? This is then easy to save with the 'save' function, and you can quickly retrieve it with 'load'.

file.names <- c('file1', ..., 'file100')
input.list <- list()
for (i in file.names){
    input.list[[i]] <- scan(i, what="")
}

You can then 'save(input.list, file='save.Rdata')'. You can access the data from the individual files with:

input.list[['file33']]

On 8/7/07, Tiandao Li wrote:
> In the first part of myfile.R, I used scan() 100 times to read data from
> 100 different tab-delimited files. I want to save this part to another
> data file, so I won't accidentally make mistakes, and I want to
> re-use/input it like the infile statement in SAS or \input{file.tex} in
> latex. I don't want to copy/paste 100 scan() calls every time I need to
> read the same data.
>
> Thanks!

--
Jim Holtman
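As an illustration of the list-plus-save approach described above, here is a self-contained sketch. The two file names and their contents are hypothetical stand-ins created in a temporary directory; with real data, file.names would hold the actual paths, and what= would be set to match the file contents.

```r
# Hypothetical stand-ins for the real tab-delimited files.
demo.dir <- tempfile("scan-demo")
dir.create(demo.dir)
file.names <- file.path(demo.dir, c("control_0h.txt", "heat_2h.txt"))
writeLines("1\t2\t3", file.names[1])
writeLines("4\t5\t6", file.names[2])

# One scan() call per file, collected in a single named list.
input.list <- lapply(file.names, scan, what = 0, sep = "\t", quiet = TRUE)
names(input.list) <- basename(file.names)

# Save the whole list once; later sessions only need load().
rdata <- file.path(demo.dir, "save.Rdata")
save(input.list, file = rdata)
rm(input.list)
load(rdata)
input.list[["heat_2h.txt"]]
```

Naming the list elements after the files keeps the descriptive file names available for later lookups.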
Re: [R] input data file
You don't have to name them after numbers. What I sent was just an example of a character vector with file names. If you have all the files in a directory, then you can set up the loop to read in all the files (or selected ones based on a pattern match). If you are copy/pasting the 'scan' command, then you must somehow be changing the file name that is being read and the R object that you are storing the values in. You can use list.files(pattern=..) to select a list of file names. This is much easier than copy/paste.

On 8/8/07, Tiandao Li wrote:
> I thought of a loop at first. My data were generated from 32 microarray
> experiments, each with 3 replicates, 96 files in total. I named the files
> based on different conditions or time series, and I really don't want to
> name them after numbers. It would confuse me later when I need to
> refer to or compare them.

--
Jim Holtman
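To make the list.files(pattern=..) suggestion concrete, here is a small sketch. The directory, the file names ("heat_rep1.txt" and so on) and their contents are invented for the example; the point is only that descriptive file names can be kept and reused as list names.

```r
# Hypothetical directory with condition-based file names.
array.dir <- tempfile("arrays")
dir.create(array.dir)
for (f in c("heat_rep1.txt", "heat_rep2.txt", "cold_rep1.txt", "notes.log")) {
  writeLines("1\t2", file.path(array.dir, f))
}

# Pick up every .txt file; the escaped dot and '$' keep 'notes.log' out.
files  <- list.files(array.dir, pattern = "\\.txt$", full.names = TRUE)
arrays <- lapply(files, scan, what = 0, sep = "\t", quiet = TRUE)
names(arrays) <- sub("\\.txt$", "", basename(files))

# Select by condition through the names.
heat <- arrays[grep("^heat_", names(arrays))]
names(heat)
```

With this layout, comparing conditions later is a matter of pattern matching on names(arrays) rather than remembering numeric indices.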
Re: [R] Secondary axis
plot(...)
par(new=TRUE)
plot(...)
axis(4,...)

On 8/5/07, Patrick Martin wrote:
> Dear R help list members,
>
> I am trying to plot two sets of data (both of which are zoo objects) in
> the same graph using two separate y-axes with different scales, with the
> x-axis consisting of dates. I have simply used a plot() command to plot
> the first set of data, and then added the second set with lines(). I have
> also tried to add a further y-axis (at side=4), but this simply comes up
> with the same scale as the first y-axis. I somehow need to 'associate' one
> of the data sets with the second y-axis, such that it will scale sensibly
> (my first data set ranges from 0-25, the second one from 0 to 40). The
> problem is compounded by the fact that the two data sets have very
> different frequencies: one consists of twice-monthly measurements, the
> other of hourly measurements.
>
> I would be very grateful for advice on how to do this.
>
> Thanks in advance,
> Patrick Martin

--
Jim Holtman
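The four-line outline above can be filled in roughly as follows. This is a sketch with made-up series (plain POSIXct vectors rather than zoo objects); the variable names, colours and margins are my own choices. The key points are that both plots share the same xlim, and that the second plot suppresses its own axes so that axis(4) can draw the 0-40 scale.

```r
# Made-up data: a sparse series (0-25) and an hourly series (0-40).
t1 <- seq(as.POSIXct("2007-01-01", tz = "GMT"), by = "15 days", length.out = 12)
y1 <- seq(0, 25, length.out = 12)
t2 <- seq(as.POSIXct("2007-01-01", tz = "GMT"), by = "hour", length.out = 24 * 166)
y2 <- 20 + 20 * sin(seq_along(t2) / 200)

xlim <- range(t1, t2)                      # common x range for both plots

op <- par(mar = c(5, 4, 4, 4) + 0.1)       # extra room on the right
plot(t1, y1, type = "b", xlim = xlim, ylim = c(0, 25),
     xlab = "Date", ylab = "series 1")
par(new = TRUE)                            # overlay the next plot
plot(t2, y2, type = "l", col = "red", xlim = xlim, ylim = c(0, 40),
     axes = FALSE, xlab = "", ylab = "")
axis(4, col.axis = "red")                  # right-hand axis on the 0-40 scale
mtext("series 2", side = 4, line = 3, col = "red")
par(op)
```

If the two plots are given different xlim values, the series will be misaligned in time even though each looks plausible on its own.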
Re: [R] sink behavior
'sink' will capture 'printed' output from your program. Try:

# Using a matrix as a simple example.
dumpMatrix = function(mat) {
    sink(file = "mat.txt")
    print(mat)
    sink(NULL)
}

In this case, there is an explicit 'print' statement. At the command line, there is an implicit 'print' when you give an object name; inside a function there is not.

On 8/6/07, Daniel Gatti wrote:
> There is a package called 'safe' that produces an object which I can only
> write to a file using the sink() function. It works fine if the sink()
> command is not inside of a function, but it does not write anything to the
> file if the command is within a function.
>
> Sample code:
>
> # Using a matrix as a simple example.
> dumpMatrix = function(mat) {
>     sink(file = "mat.txt")
>     mat
>     sink(NULL)
> }
>
> # This will write the file correctly.
> x = matrix(100, 10, 10)
> sink(file = "x.txt")
> x
> sink(NULL)
>
> # This will create an empty file.
> dumpMatrix(x)
>
> R 2.5.1
> Windows XP, SP2
>
> The sink() docs are full of warnings, but I'm not clear which one I've
> violated with this example.
>
> Dan

--
Jim Holtman
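A slightly more defensive variant of the function above is sketched here. It is my own suggestion, not part of the reply: it takes the file name as an argument and uses on.exit() so that output is redirected back to the console even if print() fails part-way.

```r
dumpMatrix <- function(mat, file) {
  sink(file)
  on.exit(sink())    # always undo the redirection when the function exits
  print(mat)         # explicit print: auto-printing only happens at top level
  invisible(file)
}

out <- tempfile(fileext = ".txt")
dumpMatrix(matrix(100, 2, 2), out)
readLines(out)
```

capture.output(print(mat), file = out) is another option that avoids managing the sink by hand.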
Re: [R] Access an entry after reading a table
read.table will convert your character columns to factors. You are seeing a single value returned (A), but it is also reporting the levels of the factor. One way is to read the data in without conversion to factors:

Model=read.table("ModelMat.txt", header=TRUE, as.is=TRUE)

or you can convert the factor to character for output:

as.character(Model[1,1])

On 8/2/07, Gang Chen wrote:
> Sorry about this basic question. After reading a table,
>
> Model=read.table("ModelMat.txt", header=T)
>
> I want to get access to each entry in the table Model. However, if I do
>
> Model[1,1]
>
> I get the following,
>
> [1] A
> Levels: A B C
>
> My question is, how can I just get the entry A without the 2nd line
> (Levels: A B C)?
>
> Thanks,
> Gang

--
Jim Holtman
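The difference between the two options can be seen on a tiny inline table. The sketch below uses made-up data in place of ModelMat.txt and sets stringsAsFactors = TRUE explicitly to reproduce the behaviour described in the thread; since R 4.0.0 the default is FALSE, so recent versions return plain character columns without as.is.

```r
txt <- "grp value
A 1
B 2
C 3"

as_factor <- read.table(text = txt, header = TRUE, stringsAsFactors = TRUE)
as_char   <- read.table(text = txt, header = TRUE, as.is = TRUE)

as_factor[1, 1]                 # prints the value followed by a 'Levels:' line
as.character(as_factor[1, 1])   # plain string from a factor
as_char[1, 1]                   # already a plain string
```

If the column is needed as a factor elsewhere (for example in a model formula), converting only at output time with as.character() keeps both uses working.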
Re: [R] Reading Matrices
1.568896 2.014273 1.40849 1.730805 1.645146
1.701634 0 1.12 1.072174 1.238244 1.113678 1.58344 1.682667 1.55684 1.59504 1.423253 1.961898 2.407275 1.801492 2.123807 2.038148
1.732966 1.12 0 0.548174 0.977588 0.853022 1.614772 1.713999 1.588172 1.626372 1.454585 1.99323 2.438607 1.832824 2.155139 2.06948
1.638478 1.072174 0.548174 0 0.8831 0.758534 1.520284 1.619511 1.493684 1.531884 1.360097 1.898742 2.344119 1.738336 2.060651 1.974992
1.804548 1.238244 0.977588 0.8831 0 0.728972 1.686354 1.785581 1.659754 1.697954 1.526167 2.064812 2.510189 1.904406 2.226721 2.141062
1.679982 1.113678 0.853022 0.758534 0.728972 0 1.561788 1.661015 1.535188 1.573388 1.401601 1.940246 2.385623 1.77984 2.102155 2.016496
1.808682 1.58344 1.614772 1.520284 1.686354 1.561788 0 0.477421 1.199876 1.238076 1.530301 2.068946 2.514323 1.90854 2.230855 2.145196
1.907909 1.682667 1.713999 1.619511 1.785581 1.661015 0.477421 0 1.299103 1.337303 1.629528 2.168173 2.61355 2.007767 2.330082 2.244423
1.782082 1.55684 1.588172 1.493684 1.659754 1.535188 1.199876 1.299103 0 0.823034 1.503701 2.042346 2.487723 1.88194 2.204255 2.118596
1.820282 1.59504 1.626372 1.531884 1.697954 1.573388 1.238076 1.337303 0.823034 0 1.541901 2.080546 2.525923 1.92014 2.242455 2.156796
1.393257 1.423253 1.454585 1.360097 1.526167 1.401601 1.530301 1.629528 1.503701 1.541901 0 1.653521 2.098898 1.493115 1.81543 1.729771
1.568896 1.961898 1.99323 1.898742 2.064812 1.940246 2.068946 2.168173 2.042346 2.080546 1.653521 0 0.988747 1.51847 1.840785 1.755126
2.014273 2.407275 2.438607 2.344119 2.510189 2.385623 2.514323 2.61355 2.487723 2.525923 2.098898 0.988747 0 1.963847 2.286162 2.200503
1.40849 1.801492 1.832824 1.738336 1.904406 1.77984 1.90854 2.007767 1.88194 1.92014 1.493115 1.51847 1.963847 0 1.054405 0.968746
1.730805 2.123807 2.155139 2.060651 2.226721 2.102155 2.230855 2.330082 2.204255 2.242455 1.81543 1.840785 2.286162 1.054405 0 0.722953
1.645146 2.038148 2.06948 1.974992 2.141062 2.016496 2.145196 2.244423 2.118596 2.156796 1.729771 1.755126 2.200503 0.968746 0.722953 0

--
Jim Holtman
Re: [R] Simple table with frequency variable
I am not exactly sure what you are asking for. I am assuming that you want a vector that represents the combinations that are present:

N
 [1] 11 22 31 42 51 12 21 32 41 52
table(i,j)
   j
i   1 2
  1 1 1
  2 1 1
  3 1 1
  4 1 1
  5 1 1
z <- table(i,j)
which(z==1)
 [1]  1  2  3  4  5  6  7  8  9 10
which(z==1,arr.ind=T)
  row col
1   1   1
2   2   1
3   3   1
4   4   1
5   5   1
1   1   2
2   2   2
3   3   2
4   4   2
5   5   2
x <- which(z==1,arr.ind=T)
paste(rownames(z)[x[,'row']], colnames(z)[x[,'col']], sep='')
 [1] "11" "21" "31" "41" "51" "12" "22" "32" "42" "52"

On 8/1/07, G. Draisma wrote:
> Hallo,
>
> I'm trying to find out how to tabulate frequencies of factors when the
> data have a frequency variable, e.g.:
>
> i<-rep(1:5,2)
> j<-rep(1:2,5)
> N<-10*i+j
>
> table(i,j) gives a table of ones, as each combination occurs only once.
> How does one get a table with the corresponding N's?
>
> Thanks!
> Gerrit.
>
> --
> Gerrit Draisma
> Department of Public Health
> Erasmus MC, University Medical Center Rotterdam
> Room AE-103
> P.O. Box 2040 3000 CA Rotterdam
> The Netherlands
> Phone: +31 10 4087124
> Fax: +31 10 4638474
> http://mgzlx4.erasmusmc.nl/pwp/?gdraisma

--
Jim Holtman
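If the goal is, as the question reads to me, a two-way table whose cells hold the frequency variable N rather than counts of rows, then xtabs() with N on the left-hand side of the formula does that directly. This is a sketch using the poster's own three lines of data; tapply() is shown as an equivalent alternative.

```r
i <- rep(1:5, 2)
j <- rep(1:2, 5)
N <- 10 * i + j

# Sum of N within each (i, j) cell; each cell has one row here,
# so the table simply shows the N values.
tab <- xtabs(N ~ i + j)
tab

# Same result as a plain matrix.
tab2 <- tapply(N, list(i = i, j = j), sum)
```

With real data, where several rows may share an (i, j) combination, both calls add up the frequencies in that cell.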
Re: [R] Problem to remove loops in a routine
=), paste(, Group ,l,sep=,sep=))
trellis.par.set(par.xlab.text=list(cex=trellis.par.get("axis.text")[2]))
trellis.par.set(par.ylab.text=list(cex=trellis.par.get("axis.text")[2]))
print(myplot, panel.width=list(x=(0.75/nTrellisCol), units="npc"), panel.height=list(x=(0.50/nTrellisRow), units="npc"))
dev.off()

--
Jim Holtman
Re: [R] how to combine data of several csv-files
Here is the modified script for computing the 'sd':

v1 <- NA
v2 <- rnorm(6)
v3 <- rnorm(6)
v4 <- rnorm(6)
v5 <- rnorm(6)
v6 <- rnorm(6)
v7 <- rnorm(6)
v8 <- rnorm(6)
v8 <- NA
list <- list(v1,v2,v3,v4,v5,v6,v7,v8)
categ <- c(NA,"cat1","cat1","cat1","cat2","cat2","cat2",NA)
# create partitioned list
list.cat <- split(list, categ)
# combine each partition into a matrix
list.mat <- lapply(list.cat, function(x) do.call('rbind', x))
# now take the means of each column
lapply(list.mat, colMeans)
# compute the 'sd' by using 'apply' on the columns
lapply(list.mat, apply, 2, sd)

On 7/31/07, 8rino-Luca Pantani wrote:
>> Hi Jim,
>>
>> that's exactly what I'm looking for. Thank you so much. I think I should
>> look for some further documentation on list handling.
>
> I think I will do the same... Thanks to Jim I learned textConnection and
> rowMeans.
>
> Jim, could you please go a step further and tell me how to use lapply to
> calculate the sd instead of the mean of the same items? I mean
>
> sd(-0.6442149 0.02354036 -1.40362589)
> sd(-1.1829260 1.17099178 -0.046778203)
> sd(-0.2047012 -1.36186952 0.13045724)
> etc
>
> x <- read.table(textConnection("v1 v2 v3 v4 v5 v6 v7 v8
> NA -0.6442149 0.02354036 -1.40362589 -1.1829260 1.17099178 -0.046778203 NA
> NA -0.2047012 -1.36186952 0.13045724 2.1411553 0.49248118 -0.233788840 NA
> NA -1.1986041 -0.42197792 -0.84651458 -0.1327081 -0.18690065 0.443908897 NA
> NA -0.2097442 1.50445971 1.57005071 -0.1053442 1.50050976 -1.649740180 NA
> NA -0.7343465 -1.76763996 0.06961015 -0.8179396 -0.65552410 0.003991354 NA
> NA -1.3888750 0.53722404 0.25269771 -1.2342698 -0.01243247 -0.228020092 NA"), header=TRUE)

--
Jim Holtman
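To see what the split / rbind / apply chain produces, here is a sketch on small made-up vectors where the answers can be checked by hand. The data and the names vals, by.cat and summ are mine; the structure follows the script above.

```r
# Made-up vectors of length 2 in place of the rnorm(6) draws.
vals  <- list(NA, c(1, 2), c(3, 4), c(5, 9), c(10, 20), c(30, 40), c(50, 60), NA)
categ <- c(NA, "cat1", "cat1", "cat1", "cat2", "cat2", "cat2", NA)

# split() drops the entries whose category is NA;
# each remaining group becomes a matrix with one row per vector.
by.cat <- lapply(split(vals, categ), function(v) do.call(rbind, v))

# Element-wise mean and sd across the vectors of each category.
summ <- lapply(by.cat, function(m) rbind(mean = colMeans(m),
                                         sd   = apply(m, 2, sd)))
summ$cat1
```

For cat1 the first elements are 1, 3 and 5, so the first column of summ$cat1 holds mean 3 and sd 2.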
Re: [R] reading and storing files in the workspace
try:

for (i in test){
    assign(gsub(".txt$", "", i), read.table(i, header=TRUE))
}

On 7/31/07, Luis Ridao Cruz wrote:
> R-help,
>
> I have a vector (test) containing some file names. The files' contents are
> matrices.
>
> test
>  [1] "aaOki.txt"    "aOki.txt"     "bOki.txt"     "c1Oki.txt"    "c2Oki.txt"
>      "c3Oki.txt"    "cOki.txt"     "dOki.txt"     "dyp100.txt"   "dyp200.txt"
> [11] "dyp300.txt"   "dyp400.txt"   "dyp500.txt"   "dyp600.txt"   "dyp700.txt"
>      "dyp800.txt"   "eOki.txt"     "FBdyp100.txt" "FBdyp150.txt" "FBdyp200.txt"
>
> What I want to do is to import them into R using the same file name, with
> the .txt extension removed from the object name. Something like this:
>
> for(i in test)
>     gsub("\\.", "", paste(i, sep = "")) <- read.table(file = paste(i, sep = ""), header = TRUE)
>
> But I get the following message:
>
> Error in gsub("\\.", "", paste(i, sep = "")) <- read.table(file = paste(i, :
>     target of assignment expands to non-language object
>
> Thanks in advance.
>
> version
>                _
> platform       i386-pc-mingw32
> arch           i386
> os             mingw32
> system         i386, mingw32
> status
> major          2
> minor          5.1
> year           2007
> month          06
> day            27
> svn rev        42083
> language       R
> version.string R version 2.5.1 (2007-06-27)

--
Jim Holtman
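A self-contained sketch of the assign() approach follows. Two details are my own additions rather than part of the reply: the dot in the pattern is escaped ("\\.txt$") so that only a literal ".txt" suffix is removed, and the objects are created in a separate environment (ws) instead of the global workspace. The two file names and their contents are stand-ins written to a temporary directory.

```r
# Stand-in files so the example runs anywhere.
oki.dir <- tempfile("oki")
dir.create(oki.dir)
test <- c("aOki.txt", "dyp100.txt")
for (f in test) {
  write.table(data.frame(a = 1:2, b = 3:4), file.path(oki.dir, f),
              row.names = FALSE)
}

ws <- new.env()   # keeps the imported objects together
for (f in test) {
  obj.name <- sub("\\.txt$", "", f)
  assign(obj.name, read.table(file.path(oki.dir, f), header = TRUE), envir = ws)
}
ls(ws)
ws$aOki
```

Dropping the envir argument gives the behaviour in the reply, with one object per file in the workspace.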
Re: [R] regular expressions : extracting numbers
Is this what you want:

x
 [1] "lema, rb 2%"   "rb 2%"         "rb 3%"         "rb 4%"         "rb 3%"         "rb 2%,mineuse"
 [7] "rb"            "rb"            "rb 12"         "rb"            "rj 30%"        "rb"
[13] "rb"            "rb 25%"        "rb"            "rb"            "rb"            "rj, rb"
gsub("[^0-9]*([0-9]*)[^0-9]*", "\\1", x)
 [1] "2"  "2"  "3"  "4"  "3"  "2"  ""   ""   "12" ""   "30" ""   ""   "25" ""   ""   ""   ""

On 7/30/07, GOUACHE David wrote:
> Hello all,
>
> I have a vector of character strings, in which I have letters, numbers,
> and symbols. What I wish to do is obtain a vector of the same length with
> just the numbers. A quick example - extract of the original vector:
>
> "lema, rb 2%" "rb 2%" "rb 3%" "rb 4%" "rb 3%" "rb 2%,mineuse" "rb" "rb"
> "rb 12" "rb" "rj 30%" "rb" "rb" "rb 25%" "rb" "rb" "rb" "rj, rb"
>
> and the type of thing I wish to end up with:
>
> "2" "2" "3" "4" "3" "2" "" "" "12" "" "30" "" "" "25" "" "" "" ""
>
> or, instead of "", NA would be acceptable (actually it would almost be
> better for me).
>
> Anyway, I've been battling with gsub() and things of the sort, but I'm
> drowning in the regular expressions, despite a few hours of looking at
> Perl tutorials... So if anyone can help me out, it would be greatly
> appreciated!!
>
> In advance, thanks very much.
>
> David Gouache
> Arvalis - Institut du Végétal
> Station de La Minière
> 78280 Guyancourt
> Tel: 01.30.12.96.22 / Port: 06.86.08.94.32

--
Jim Holtman
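The poster mentions that NA would be preferable to an empty string. A short sketch of that variant, on a shortened version of the vector, is below. It assumes each string holds at most one number; if a string contained two separate numbers, removing all non-digits would run them together.

```r
x <- c("lema, rb 2%", "rb 2%", "rb", "rb 12", "rj 30%", "rj, rb")

digits <- gsub("[^0-9]", "", x)                       # drop every non-digit
nums   <- as.integer(ifelse(nzchar(digits), digits, NA))
nums
```

The result is an integer vector of the same length as x, with NA wherever the string had no digits.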
Re: [R] how to combine data of several csv-files
This should do it:

v1 <- NA
v2 <- rnorm(6)
v3 <- rnorm(6)
v4 <- rnorm(6)
v5 <- rnorm(6)
v6 <- rnorm(6)
v7 <- rnorm(6)
v8 <- rnorm(6)
v8 <- NA
list <- list(v1,v2,v3,v4,v5,v6,v7,v8)
categ <- c(NA,"cat1","cat1","cat1","cat2","cat2","cat2",NA)
# create partitioned list
list.cat <- split(list, categ)
# combine each partition into a matrix
list.mat <- lapply(list.cat, function(x) do.call('rbind', x))
# now take the means of each column
lapply(list.mat, colMeans)
$cat1
[1] -0.5699080  0.3855693  1.1051809  0.2379324  0.6684713  0.3240003

$cat2
[1]  0.38160462 -0.10559496 -0.40963090 -0.09507354  0.95021406 -0.31491450

On 7/30/07, Antje wrote:
> okay, I played around a bit and now I have some kind of test case for you:
>
> v1 <- NA
> v2 <- rnorm(6)
> v3 <- rnorm(6)
> v4 <- rnorm(6)
> v5 <- rnorm(6)
> v6 <- rnorm(6)
> v7 <- rnorm(6)
> v8 <- rnorm(6)
> v8 <- NA
> list <- list(v1,v2,v3,v4,v5,v6,v7,v8)
> categ <- c(NA,"cat1","cat1","cat1","cat2","cat2","cat2",NA)
>
> list
> [[1]]
> [1] NA
>
> [[2]]
> [1] -0.6442149 -0.2047012 -1.1986041 -0.2097442 -0.7343465 -1.3888750
>
> [[3]]
> [1]  0.02354036 -1.36186952 -0.42197792  1.50445971 -1.76763996  0.53722404
>
> [[4]]
> [1] -1.40362589  0.13045724 -0.84651458  1.57005071  0.06961015  0.25269771
>
> [[5]]
> [1] -1.1829260  2.1411553 -0.1327081 -0.1053442 -0.8179396 -1.2342698
>
> [[6]]
> [1]  1.17099178  0.49248118 -0.18690065  1.50050976 -0.65552410 -0.01243247
>
> [[7]]
> [1] -0.046778203 -0.233788840  0.443908897 -1.649740180  0.003991354 -0.228020092
>
> [[8]]
> [1] NA
>
> now, I need the means (and sd) of element 1 of list[2], list[3], list[4]
> (because they belong to cat1)
> => mean(-0.6442149, 0.02354036, -1.40362589)
> and the same for element 2 up to element 6 (--> I would then get a vector
> containing the means for cat1), and the same for the vectors belonging to
> cat2.
>
> does anybody now understand what I mean?
>
> Antje

--
Jim Holtman
Re: [R] Looping through all possible combinations of cases
Here is how to do it for 2; you can extend it: # test data n - 100 x - data.frame(id=sample(letters[1:4], n, TRUE), values=runif(n)) # get combinations of 2 at a time comb.2 - combn(unique(as.character(x$id)), 2) for (i in 1:ncol(comb.2)){ + cat(sprintf(%s:%s %f\n,comb.2[1,i], comb.2[2,i], + sum(x$value[x$id %in% comb.2[,i]]))) + } c:d 25.259988 c:b 21.268737 c:a 21.250933 d:b 26.013253 d:a 25.995450 b:a 22.004198 On 7/27/07, Dimitri Liakhovitski [EMAIL PROTECTED] wrote: Hello! I have a regular data frame (DATA) with 10 people and 1 column ('variable'). Its cases are people with names ('a', 'b', 'c', 'd', 'e', 'f', etc.). I would like to write a function that would sum up the values on 'variable' of all possible combinations of people, i.e. 1. I would like to write a loop - in such a way that it loops through each possible pair of cases (i.e., ab, ac, ad, etc.) and sums up their respective values on 'variable' 2. I would like to write a loop - in such a way that it loops through each possible trio of cases (i.e., abc, abd, abe, etc.) and sums up their respective values on 'variable'. 3. I would like to write a loop - in such a way that it loops through each possible quartet of cases (i.e., abcd, abce, abcf, etc.) and sums up their respective values on 'variable'. etc. Then, at the end I want to capture all possible combinations that were considered (i.e., what elements were combined in it) and get the value of the sum for each combination. How should I do it? Thanks a lot! Dimitri __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve? 
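The pairwise loop in the reply can be extended to trios, quartets and so on by letting combn() produce the combinations for each size k. The following is only a sketch under assumptions: DATA, its column names and the helper sum.combs() are made up for illustration (four people instead of ten), and the result is collected in a data frame rather than printed.

```r
# Hypothetical data: one row per person, one value each
DATA <- data.frame(id = c("a", "b", "c", "d"),
                   variable = c(1, 2, 3, 4),
                   stringsAsFactors = FALSE)

# Sum 'variable' over every combination of k people
sum.combs <- function(df, k) {
  combs <- combn(as.character(df$id), k)   # one combination per column
  data.frame(
    members = apply(combs, 2, paste, collapse = ":"),
    total   = apply(combs, 2, function(ids) sum(df$variable[df$id %in% ids])),
    stringsAsFactors = FALSE
  )
}

# All combination sizes from pairs up to everybody
all.sums <- do.call(rbind, lapply(2:nrow(DATA), function(k) sum.combs(DATA, k)))
print(all.sums)
```

With 10 people this gives 2^10 - 11 = 1013 rows, which is still small; the count roughly doubles with every additional person.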
Re: [R] Order by the columns
See ?order. You could do something like:

mat[order(mat[,1], mat[,2], mat[,3]),]

On 7/29/07, Am Stat [EMAIL PROTECTED] wrote:
> Dear useR,
>
> I have a data matrix with n columns; each column is a two-level
> variable with entries -1 and +1. They are randomly generated. Now I
> want to order them like (for example, the 5-column case)
>
> --- - - -- - -- . (first several rows are the samples with all variables in low level) + - -- - + - --- . - + -- - + + -- - + + + + +
>
> Is there any function in R that could let me do this: order by Var1,
> then order by Var2, then ... order by Var n?
>
> Thanks very much in advance!
>
> Best,
> Leon

-- 
Jim Holtman
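For an arbitrary number of columns, typing out every mat[,j] gets tedious. One possible sketch, assuming the data really is a plain numeric matrix, is to build the list of columns and hand it to order() through do.call(). The matrix below is random test data, not the poster's.

```r
set.seed(1)   # arbitrary seed, only so the example is repeatable
mat <- matrix(sample(c(-1, 1), 40, replace = TRUE), ncol = 5)

# One sort key per column, in column order (unnamed list on purpose)
keys <- lapply(seq_len(ncol(mat)), function(j) mat[, j])

# Same as order(mat[,1], mat[,2], ..., mat[,5]) without typing each column
sorted <- mat[do.call(order, keys), , drop = FALSE]
print(sorted)
```

Rows that are all -1 come first and rows that are all +1 come last, because -1 sorts before +1 in every key.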
Re: [R] write.csv
Then you can just write a 'for' loop to write out each submatrix:

for (i in 1:dim(x)[3]){
    write.csv(x[,,i], paste("x", i, ".csv", sep=""))
}

On 7/30/07, Dong GUO 郭东 [EMAIL PROTECTED] wrote:
> The dim of my results is (26,31,8) - (years, regions and variables).
> So, if I save each (years, regions) slice in 8 csv files, later I
> could connect the (26,31) tables to a dbf file in ArcGIS to show in a
> map. This is what I intend to do. I don't know a better way to do it
> directly in R...
>
> On 7/31/07, jim holtman [EMAIL PROTECTED] wrote:
>> It really depends on how you want it output. You can use 'write.csv'
>> to write an array out and it will be a 2-dimensional image that you
>> could then reconstruct the array from if you know what the
>> dimensions were.
>>
>> What do you want to do with the data? If you are just going to read
>> it back into R, then use save/load.
>>
>> On 7/29/07, Dong GUO 郭东 [EMAIL PROTECTED] wrote:
>>> Hi,
>>>
>>> I want to save an array (say, array[6,7,8]) and write a csv file.
>>> How can I do that? Can I write it in one file?
>>>
>>> If I cannot write it in one file, I want to use a loop to save it
>>> in different files (for array[6,7,8] that should be 8 csv files);
>>> the filename structure should be:
>>> file = filename + str(i) + . + csv
>>>
>>> Many thanks.

-- 
Jim Holtman
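A small self-contained version of that loop, as a sketch: the array contents, the use of tempdir() and the choice of row.names = FALSE are assumptions made here so the example can be run anywhere and read back cleanly.

```r
# Hypothetical 6 x 7 x 8 array standing in for the real results
x <- array(seq_len(6 * 7 * 8), dim = c(6, 7, 8))

out.dir <- tempdir()                     # write somewhere harmless
files <- character(dim(x)[3])
for (i in seq_len(dim(x)[3])) {
  files[i] <- file.path(out.dir, paste("x", i, ".csv", sep = ""))
  write.csv(x[, , i], files[i], row.names = FALSE)   # one slice per file
}

# Read one slice back to check the round trip
back <- as.matrix(read.csv(files[3]))
print(dim(back))
```

Keeping the file names in a vector makes it easy to loop over the same files again later.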
Re: [R] how to combine data of several csv-files
Is this what you want: x - read.table(textConnection( v1 v2 v3 v4 v5 v6 v7 v8 + 1 NA -0.6442149 0.02354036 -1.40362589 -1.1829260 1.17099178 -0.046778203 NA + 2 NA -0.2047012 -1.36186952 0.13045724 2.1411553 0.49248118 -0.233788840 NA + 3 NA -1.1986041 -0.42197792 -0.84651458 -0.1327081 -0.18690065 0.443908897 NA + 4 NA -0.2097442 1.50445971 1.57005071 -0.1053442 1.50050976 -1.649740180 NA + 5 NA -0.7343465 -1.76763996 0.06961015 -0.8179396 -0.65552410 0.003991354 NA + 6 NA -1.3888750 0.53722404 0.25269771 -1.2342698 -0.01243247 -0.228020092 NA), header=TRUE) categ - scan(textConnection(NAcat1cat1cat1 cat2cat2 cat2 NA), what='') Read 8 items cat.col - split(1:ncol(x), categ) lapply(cat.col, function(.cat){ + rowMeans(x[, .cat]) + }) $cat1 1 2 3 4 5 -0.674766810001 -0.47870449 -0.82236554 0.95492207 -0.81079210 6 -0.19965108 $cat2 123 45 -0.0195708076663 0.7999492133324 0.0414333823334 -0.0848582066670 -0.4898241153337 6 -0.4915741206667 On 7/30/07, Antje [EMAIL PROTECTED] wrote: Hello, thank you for your help. But I guess, it's still not what I want... 
printing df.my gives me df.my v1 v2 v3 v4 v5 v6 v7 v8 1 NA -0.6442149 0.02354036 -1.40362589 -1.1829260 1.17099178 -0.046778203 NA 2 NA -0.2047012 -1.36186952 0.13045724 2.1411553 0.49248118 -0.233788840 NA 3 NA -1.1986041 -0.42197792 -0.84651458 -0.1327081 -0.18690065 0.443908897 NA 4 NA -0.2097442 1.50445971 1.57005071 -0.1053442 1.50050976 -1.649740180 NA 5 NA -0.7343465 -1.76763996 0.06961015 -0.8179396 -0.65552410 0.003991354 NA 6 NA -1.3888750 0.53722404 0.25269771 -1.2342698 -0.01243247 -0.228020092 NA now, I have to combine like this: v1 v2 v3 v4 v5 v6 v7 v8 NAcat1cat1cat1 cat2cat2 cat2 NA -- mean(df.my$v2[1],df.my$v3[1],df.my$v4[1]) mean(df.my$v2[2],df.my$v3[2],df.my$v4[2]) mean(df.my$v2[3],df.my$v3[3],df.my$v4[3]) mean(df.my$v2[4],df.my$v3[4],df.my$v4[4]) mean(df.my$v2[5],df.my$v3[5],df.my$v4[5]) mean(df.my$v2[6],df.my$v3[6],df.my$v4[6]) the same for v5, v6 and v7 further, I'm not sure how to avoid the list, because this is the result of the processing I did before... Ciao, Antje 8rino-Luca Pantani schrieb: I hope I see. 
Why not try the following, and avoid lists, which I'm not still able to manage properly ;-) v1 - NA v2 - rnorm(6) v3 - rnorm(6) v4 - rnorm(6) v5 - rnorm(6) v6 - rnorm(6) v7 - rnorm(6) v8 - rnorm(6) v8 - NA (df.my - cbind.data.frame(v1, v2, v3, v4, v5, v6, v7, v8)) (df.my2 - reshape(df.my, varying=list(c(v1,v2,v3, v4,v5,v6,v7,v8)), idvar=sequential, timevar=cat, direction=long )) aggregate(df.my2$v1, by=list(category=df.my2$cat), mean) aggregate(df.my2$v1, by=list(category=df.my2$cat), function(x){sd(x, na.rm = TRUE)}) Antje ha scritto: okay, I played a bit around and now I have some kind of testcase for you: v1 - NA v2 - rnorm(6) v3 - rnorm(6) v4 - rnorm(6) v5 - rnorm(6) v6 - rnorm(6) v7 - rnorm(6) v8 - rnorm(6) v8 - NA list - list(v1,v2,v3,v4,v5,v6,v7,v8) categ - c(NA,cat1,cat1,cat1,cat2,cat2,cat2,NA) list [[1]] [1] NA [[2]] [1] -0.6442149 -0.2047012 -1.1986041 -0.2097442 -0.7343465 -1.3888750 [[3]] [1] 0.02354036 -1.36186952 -0.42197792 1.50445971 -1.76763996 0.53722404 [[4]] [1] -1.40362589 0.13045724 -0.84651458 1.57005071 0.06961015 0.25269771 [[5]] [1] -1.1829260 2.1411553 -0.1327081 -0.1053442 -0.8179396 -1.2342698 [[6]] [1] 1.17099178 0.49248118 -0.18690065 1.50050976 -0.65552410 -0.01243247 [[7]] [1] -0.046778203 -0.233788840 0.443908897 -1.649740180 0.003991354 -0.228020092 [[8]] [1] NA now, I need the means (and sd) of element 1 of list[2],list[3],list[4] (because they belong to cat1) and = mean(-0.6442149, 0.02354036, -1.40362589) the same for element 2 up to element 6 (-- I would the get a vector containing the means for cat1) the same for the vectors belonging to cat2. does anybody now understand what I mean? Antje __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What
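If the data has to stay in a list, as in Antje's test case, the same per-category row means (and standard deviations) can be computed by binding the vectors of each category into a matrix first. This is a sketch with freshly generated numbers, so the values differ from the ones printed in the thread; 'vals' is a made-up name used instead of 'list', which would mask the base function.

```r
set.seed(42)   # arbitrary, for repeatability only
vals  <- list(NA, rnorm(6), rnorm(6), rnorm(6), rnorm(6), rnorm(6), rnorm(6), NA)
categ <- c(NA, "cat1", "cat1", "cat1", "cat2", "cat2", "cat2", NA)

# Positions belonging to each category; split() drops the NA entries
groups <- split(seq_along(categ), categ)

stats <- lapply(groups, function(idx) {
  m <- do.call(cbind, vals[idx])          # 6 rows, one column per vector
  data.frame(mean = rowMeans(m), sd = apply(m, 1, sd))
})
print(stats$cat1)
```

Note that mean(a, b, c) does not average three numbers: only the first argument is the data, so the values have to be combined with c() or, as here, placed in the columns of a matrix.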
Re: [R] problems saving and loading (PLMset) objects
You just need to say:

load("expr.RData")

You should not be assigning it to 'expr', since the object is already restored under its own name by 'load'; the value returned by load() is only a character vector of the names of the objects it restored.

On 7/30/07, Quin Wills [EMAIL PROTECTED] wrote:
> Hi
>
> I'm running the latest R on a presumably up to date Linux server.
> Doing something silly I'm sure, but can't see why my saved PLMset
> objects come out all wrong. To use an example:
>
> Setting up an example PLMset (I have the same problem no matter what
> example I use):
>
> > library(affyPLM)
> > data(Dilution)   # affybatch object
> > Dilution = updateObject(Dilution)
> > options(width=36)
> > expr <- fitPLM(Dilution)
>
> This works, and I'm able to get the probeset coefficients with
> coefs(expr), until I save and try reloading:
>
> > save(expr, file="expr.RData")
> > rm(expr)   # just to be sure
> > expr <- load("expr.RData")
>
> Now, running coefs(expr) says:
> Error in function (classes, fdef, mtable) : unable to find an
> inherited method for function "coefs", for signature "character"
>
> Trying str(expr) just gives the following:
> chr "expr"
>
> expr.RData appears to save properly (in that there is an actual file
> with notable size in my working directory).
>
> Thanks in advance,
> Quin

-- 
Jim Holtman
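The behaviour is easy to reproduce without affyPLM. In this sketch a small list stands in for the PLMset object (an assumption made only so the example runs on a plain R installation).

```r
expr <- list(coefs = c(1.5, 2.5))      # stand-in for the fitted object
f <- tempfile(fileext = ".RData")

save(expr, file = f)
rm(expr)                               # just to be sure

loaded <- load(f)                      # restores 'expr' as a side effect
print(loaded)                          # the NAME of what was restored
print(expr$coefs)                      # the object itself is back
```

If you do want to choose the name at load time, saveRDS() and readRDS() in current versions of R store a single object without its name, so obj <- readRDS(file) behaves the way the original post expected load() to.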
Re: [R] deriv; loop
For question 1, is this what you want? (BTW, allocate 'result' to the size you want. The example keeps extending it, which is OK for small numbers, but for larger sizes preallocate.)

> result <- numeric(0)
> for (i in 1:6) result[i] <- i
> result
[1] 1 2 3 4 5 6
> prod(result)
[1] 720

On 7/29/07, francogrex [EMAIL PROTECTED] wrote:
> Hi, 2 questions:
>
> Question 1: example of what I currently do:
>
> for(i in 1:6){sink("temp.txt",append=TRUE)
> dput(i+0)
> sink()}
> x=scan(file="temp.txt")
> print(prod(x))
> file.remove("C:/R-2.5.0/temp.txt")
>
> But how to convert the output of the loop to a vector that I can
> manipulate (by prod or sum etc.), without having to write and append
> to a file?
>
> Question 2:
>
> > deriv(~gamma(x),"x")
> expression({
>     .expr1 <- gamma(x)
>     .value <- .expr1
>     .grad <- array(0, c(length(.value), 1), list(NULL, c("x")))
>     .grad[, "x"] <- .expr1 * psigamma(x)
>     attr(.value, "gradient") <- .grad
>     .value
> })
>
> BUT
>
> > deriv3(~gamma(x),"x")
> Error in deriv3.formula(~gamma(x), "x") : Function 'psigamma' is not
> in the derivatives table
>
> What I want is the expression for the second derivative (which I
> believe is trigamma(x), or psigamma(x,1)); how can I obtain that?
>
> Thanks
> --
> View this message in context:
> http://www.nabble.com/deriv--loop-tf4166283.html#a11853456
> Sent from the R help mailing list archive at Nabble.com.

-- 
Jim Holtman
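A sketch covering both questions. The first part is the preallocated form of the loop above. The second part is not from the reply: it writes the second derivative of gamma by hand from the standard identity gamma'(x) = gamma(x) * digamma(x), which gives gamma''(x) = gamma(x) * (digamma(x)^2 + trigamma(x)). Note that trigamma(x) on its own is the second derivative of log(gamma(x)), not of gamma(x). The function name gamma2 is made up for illustration.

```r
# Question 1: preallocate, then fill
result <- numeric(6)
for (i in seq_along(result)) result[i] <- i
print(prod(result))

# Question 2: second derivative of gamma(), written out by hand
gamma2 <- function(x) gamma(x) * (digamma(x)^2 + trigamma(x))

# Sanity check against a central finite difference at x = 2.5
h  <- 1e-4
fd <- (gamma(2.5 + h) - 2 * gamma(2.5) + gamma(2.5 - h)) / h^2
print(c(analytic = gamma2(2.5), numeric = fd))
```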
Re: [R] the large dataset problem
FYI. I used your script on a Windows machine with 1.5GHZ and using the CYGWIN software that has the UNIX utilities. The field as 1000 lines with 10,000 fields on each line. Here is what it reported: gawk 'BEGIN{FS=,}{print $(1) , $(1000) , $(1275) , $(5678)}' tempxx.txt newdata.csv real0m0.806s user0m0.640s sys 0m0.124s So it took less than a second to process the file, so it still should be pretty fast on windows. BTW, the first run took 30 seconds of real time due to the slow disk that I have. The run above had the data already cached in memory. On 7/30/07, Ted Harding [EMAIL PROTECTED] wrote: On 30-Jul-07 11:40:47, Eric Doviak wrote: [...] Sympathies for the constraints you are operating in! The Introduction to R manual suggests modifying input files with Perl. Any tips on how to get started? Would Perl Data Language (PDL) be a good choice? http://pdl.perl.org/index_en.html I've not used SIPP files, but itseems that they are available in delimited format, including CSV. For extracting a subset of fields (especially when large datasets may stretch RAM resources) I would use awk rather than perl, since it is a much lighter program, transparent to code for, efficient, and it will do that job. On a Linux/Unix system (see below), say I wanted to extract fields 1, 1000, 1275, , 5678 from a CSV file. Then the 'awk' line that would do it would look like awk ' BEGIN{FS=,}{print $(1) , $(1000) , $(1275) , ... $(5678) ' sippfile.csv newdata.csv Awk reads one line at a tine, and does with it what you tell it to do. It will not be overcome by a file with an enormous number of lines. Perl would be similar. So long as one line fits comfortably into RAM, you would not be limited by file size (unless you're running out of disk space), and operation will be quick, even for very long lines (as an experiment, I just set up a file with 10,000 fields and 35 lines; awk output 6 selected fields from all 35 lines in about 1 second, on the 366MHz 128MB RAM machine I'm on at the moment. 
After transferring it to a 733MHz 512MB RAM machine, it was too quick to estimate; so I duplicated the lines to get a 363-line file, and now got those same fields out in a bit less than 1 second. So that's over 300 lines/second, 200,000 lines a minute, a million lines in 5 minutes; and all on rather puny hardware.). In practice, you might want to write a separate script which woould automatically create the necessary awk script (say if you supply the filed names, haing already coded the filed positions corresponding to filed names). You could exploit R's system() command to run the scripts from within R, and then load in the filtered data. I wrote a script which loads large datasets a few lines at a time, writes the dozen or so variables of interest to a CSV file, removes the loaded data and then (via a for loop) loads the next few lines I managed to get it to work with one of the SIPP core files, but it's SLW. See above ... Worse, if I discover later that I omitted a relevant variable, then I'll have to run the whole script all over again. If the script worked quickly (as with awk), presumably you wouldn't mind so much? Regarding Linux/Unix versus Windows. It is general experience that Linux/Unix works faster, more cleanly and efficiently, and often more reliably, for similar tasks; and cam do so on low grade hardware. Also, these systems come with dozens of file-processing utilities (including perl and awk; also many others), each of which has been written to be efficient at precisely the repertoire of tasks it was designed for. A lot of Windows sotware carries a huge overhead of either cosmetic dross, or a pantechnicon of functionality of which you are only going to need 0.01% at any one time. The Unix utilities have been ported to Windows, long since, but I have no experience of using them in that environment. Others, who have, can advise! But I'd seriously suggest getting hold of them. Hoping this helps, Ted. 
E-Mail: (Ted Harding) [EMAIL PROTECTED] Fax-to-email: +44 (0)870 094 0861 Date: 30-Jul-07 Time: 18:24:41 -- XFMail -- __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve? __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained
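For readers who would rather stay inside R, a related option (not the awk approach discussed in this thread, so treat it as an alternative sketch) is to let read.csv() skip unwanted fields: a colClasses entry of "NULL" drops that column while the file is being parsed. The file below is a small made-up stand-in with 20 fields, and the column positions are arbitrary.

```r
# Build a small stand-in CSV file with 20 fields per line
f <- tempfile(fileext = ".csv")
full <- as.data.frame(matrix(seq_len(5 * 20), nrow = 5))
write.csv(full, f, row.names = FALSE)

keep <- c(1, 7, 13)                 # positions of the fields of interest
cls <- rep("NULL", 20)              # "NULL" means: skip this column
cls[keep] <- NA                     # NA means: let R guess the type

sub <- read.csv(f, colClasses = cls)
print(sub)
```

R still has to scan every line, so for very wide files a filter such as awk that runs before R sees the data may well remain the faster route, as the timings in the thread suggest.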
Re: [R] Matrix Multiplication, Floating-Point, etc.
One thing to realize is that although the operations appear to be the same, the code being executed is different in the two cases. Because the sequence of instructions differs, different round-off errors may be introduced.

On 7/30/07, Talbot Katz [EMAIL PROTECTED] wrote:
> Thank you for responding!
>
> I realize that floating point operations are often inexact, and
> indeed, the difference between the two answers is within the
> all.equal tolerance, as mentioned in FAQ 7.31 (cited by Charles):
>
> > (as.numeric(ev1%*%ev2))==(sum(ev1*ev2))
> [1] FALSE
> > all.equal((as.numeric(ev1%*%ev2)),(sum(ev1*ev2)))
> [1] TRUE
>
> I suppose that's good enough for numerical computation. But I was
> still surprised to see that matrix multiplication (ev1%*%ev2) doesn't
> give the exact right answer, whereas sum(ev1*ev2) does give the exact
> answer. I would've expected them to perform the same two
> multiplications and one addition. But I guess that's not the case.
>
> However, I did find that if I multiplied the two vectors by 10,
> making the entries integers (although the class was still numeric
> rather than integer), both computations gave equal answers of 0:
>
> > xf1<-10*ev1
> > xf2<-10*ev2
> > (as.numeric(xf1%*%xf2))==(sum(xf1*xf2))
> [1] TRUE
>
> Perhaps the moral of the story is that one should exercise caution
> and keep track of significant digits.
>
> -- TMK --
> 212-460-5430 home
> 917-656-5351 cell
>
>> From: Charles C. Berry [EMAIL PROTECTED]
>> To: Talbot Katz [EMAIL PROTECTED]
>> CC: r-help@stat.math.ethz.ch
>> Subject: Re: [R] Matrix Multiplication, Floating-Point, etc.
>> Date: Mon, 30 Jul 2007 09:27:42 -0700
>>
>> 7.31 Why doesn't R think these numbers are equal?
>>
>> On Fri, 27 Jul 2007, Talbot Katz wrote:
>>> Hi.
>>>
>>> I recently tried the following in R 2.5.1 on Windows XP:
>>>
>>> > ev2<-c(0.8,-0.6)
>>> > ev1<-c(0.6,0.8)
>>> > ev1%*%ev2
>>>               [,1]
>>> [1,] -2.664427e-17
>>> > sum(ev1*ev2)
>>> [1] 0
>>>
>>> (I got the same result with R 2.4.1 on a different Windows XP
>>> machine.)
>>>
>>> I expect this issue is very familiar and probably has been
>>> discussed in this forum before. Can someone please point me to some
>>> documentation or discussion about this? Is there some standard way
>>> to get the correct answer from %*%?
>>>
>>> Thanks!
>>>
>>> -- TMK --
>>> 212-460-5430 home
>>> 917-656-5351 cell
>>
>> Charles C. Berry                        (858) 534-2098
>> Dept of Family/Preventive Medicine
>> E mailto:[EMAIL PROTECTED]              UC San Diego
>> http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901

-- 
Jim Holtman
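In practice the usual advice from FAQ 7.31 is to compare floating-point results with a tolerance rather than with ==. A short sketch using the vectors from the thread follows; whether the %*% result is exactly 0 or something like -2.7e-17 depends on the platform and the BLAS in use, so no particular value is assumed here.

```r
ev1 <- c(0.6, 0.8)
ev2 <- c(0.8, -0.6)

a <- as.numeric(ev1 %*% ev2)   # inner product via matrix multiplication
b <- sum(ev1 * ev2)            # inner product via elementwise product

print(a == b)                          # may be TRUE or FALSE
print(isTRUE(all.equal(a, b)))         # tolerant comparison

# An explicit tolerance, if you prefer to state it yourself
tol <- sqrt(.Machine$double.eps)
print(abs(a - b) < tol)
```

isTRUE() is wrapped around all.equal() because all.equal() returns a character description, not FALSE, when the values differ.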
Re: [R] Obtaining summary of frequencies of value occurrences for a variable in a multivariate dataset.
results=()#character() myVariableNames=names(x.val) results[length(myVariableNames)]-NA for (i in myVariableNames){ results[i]-names(x.val[[i]])# this does not work it returns a NULL (how can i convert this to x.val$somevalue ? ) } On 7/27/07, Allan Kamau [EMAIL PROTECTED] wrote: Hi All, I am having difficulties finding a way to find a substitute to the command names(v.val$PR14) so that I could generate the command on the fly for all PR14 to PR200 (please see the previous discussion below to understand what the object x.val contains) . I have tried the following results=()#character() myVariableNames=names(x.val) results[length(myVariableNames)]-NA for as.vector(unlist(strsplit(str,,)),mode=list) +results[i]-names(x.val$i)# this does not work it returns a NULL (how can i convert this to x.val$somevalue ? ) } Allan. - Original Message From: Allan Kamau [EMAIL PROTECTED] To: r-help@stat.math.ethz.ch Sent: Thursday, July 26, 2007 10:03:17 AM Subject: Re: [R] Obtaining summary of frequencies of value occurrences for a variable in a multivariate dataset. Thanks so much Jim, Andaikalavan, Gabor and others for the help and suggestions. The solution will result in a matrix containing nested matrices to enable each variable name, each variables distinct value and the count of the distinct value to be accessible individually. The main matrix will contain the variable names, the first level nested matrices will consist of the variables unique values, and each such variable entry will contain a one element vector to contain the count or occurrence frequency. This matrix can now be used in comparing other similar datasets for variable values and their frequencies. Building on the input received so far, a probable solution in building the matrix will include the following. 
1)I reading the csv file (containing column headers) my_data=read.table(path/to/my/data.csv,header=TRUE,sep=,,dec=.,fill=TRUE) 2)I group the values in each variable producing an occurrence count(frequency) x.val-apply(my_data,2,table) 3)I obtain a vector of the names of the variables in the table names(x.val) 4)Now I make use of the names (obtained in step 3) to obtain a vector of distinct values in a given variable (in the example below the variable name is $PR14) names(v.val$PR14) 5)I obtain a vector (with one element) of the frequency of a value obtained from the step above (in our example the value is V) as.vector(x.val$PR14[V]) Todo: Now I will need to place the steps above in a script (consisting of loops) to build the matrix, step 4 and 5 seem tricky to do programatically. Allan. - Original Message From: jim holtman [EMAIL PROTECTED] To: Allan Kamau [EMAIL PROTECTED] Cc: Adaikalavan Ramasamy [EMAIL PROTECTED]; r-help@stat.math.ethz.ch Sent: Wednesday, July 25, 2007 1:50:55 PM Subject: Re: [R] Obtaining summary of frequencies of value occurrences for a variable in a multivariate dataset. Also if you want to access the individual values, you can just leave it as a list: x.val - apply(x, 2, table) # access each value x.val$PR14[V] V 8 On 7/25/07, Allan Kamau [EMAIL PROTECTED] wrote: A subset of the data looks as follows df[1:10,14:20] PR10 PR11 PR12 PR13 PR14 PR15 PR16 1 VTIKVGD 2 VSIKVGG 3 VTIRVGG 4 VSIKIGG 5 VSIKVGG 6 VSIRVGG 7 VTIKIGG 8 VSIKVEG 9 VSIKVGG 10VSIKVGG The result I would like is as follows PR10PR11 PR12 ... [V:10][S:7,T:3][I:10] The result can be in a matrix or a vector and each variablename, value and frequency should be accessible so as to be used for comparisons with another dataset later. The frequency can be a count or a percentage. Allan. 
- Original Message From: Adaikalavan Ramasamy [EMAIL PROTECTED] To: Allan Kamau [EMAIL PROTECTED] Cc: r-help@stat.math.ethz.ch Sent: Tuesday, July 24, 2007 10:21:51 PM Subject: Re: [R] Obtaining summary of frequencies of value occurrences for a variable in a multivariate dataset. The name of the table should give you the value. And if you have a matrix, you just need to convert it into a vector first. m - matrix( LETTERS[ c(1:3, 3:5, 2:4) ], nc=3 ) m [,1] [,2] [,3] [1,] A C B [2,] B D C [3,] C E D tb - table( as.vector(m) ) tb A B C D E 1 2 3 2 1 paste( names(tb), :, tb, sep= ) [1] A:1 B:2 C:3 D:2 E:1 If this is not what you want, then please give a simple example. Regards, Adai Allan Kamau wrote: Hi all, If the question below as been answered before I apologize for the posting. I
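On the loop that returns NULL: x.val$i looks for a component literally named "i", because `$` does not evaluate its argument. Indexing with [[ ]] and a character variable does what was intended. A sketch using the first two columns of the sample data from this thread (lapply() is used instead of apply() here, an assumption on my part, so that the result is always a list):

```r
x <- data.frame(
  PR10 = rep("V", 10),
  PR11 = c("T", "S", "T", "S", "S", "S", "T", "S", "S", "S"),
  stringsAsFactors = FALSE
)

x.val <- lapply(x, table)             # one frequency table per variable

# Distinct values per variable, collected by name
results <- vector("list", length(x.val))
names(results) <- names(x.val)
for (nm in names(x.val)) {
  results[[nm]] <- names(x.val[[nm]])   # [[nm]], not $nm
}
print(results)

# Frequency of one value in one variable, chosen programmatically
s.count <- as.vector(x.val[["PR11"]]["S"])
print(s.count)
```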
Re: [R] get() with complex objects?
'get' tries to retrieve the object named by the character string. The error message says that no object with that name can be found. You actually have to 'evaluate' the character string. See the example below:

> x <- data.frame(a=1:10, b=11:20)
> x$a
 [1]  1  2  3  4  5  6  7  8  9 10
> z <- 'x$a'
> get(z)
Error in get(x, envir, mode, inherits) : variable "x$a" was not found
> # parse and evaluate the character string 'x$a'
> eval(parse(text=z))
 [1]  1  2  3  4  5  6  7  8  9 10

Does this make sense?

On 7/27/07, Mark Orr [EMAIL PROTECTED] wrote:
> Hello R-listers,
> I'm having trouble accessing sub objects (attributes?), e.g.,
> x$silinfo$avg.width, using the get() command; I'm using get() in a
> loop as illustrated in the following code:
>
> #FIRST MAKE CLUSTERS of VARYING k
> for (i in 1:300){
>   assign(paste("x.",i,sep=""),pam(x,i))   #WORKS FINE
> }
>
> #NEXT, TAKE LOOK AT AVE. SILHOUETTE VALUE FOR EACH k
> #PART 1, MAKE LIST OF OBJECTS NEEDED
> gen.list <- rep("t",300)
> for (i in 1:300){
>   assign(gen.list[i],paste("x.",i,"$silinfo$avg.width",sep=""))
> }
> #WORKS FINE
>
> #PART 2, USE LIST IN LOOP TO ACCESS OBJECT.
> sil.collector <- rep(99,300)
> for(i in 1:300){
>   sil.collector <- get(gen.list[i])
> }
>
> #HERE IS THE ERROR
> Error in get(x, envir, mode, inherits) : variable
> "x.1$silinfo$avg.width" was not found
>
> So, I get the gist of this error; x.1 is an object findable from
> get(), but the attribute levels are not accessible. Any suggestions
> on how to get get() to access these levels? From reading get()'s
> help page, I don't think it will access the attributes. (My apologies
> for loosely using the term attributes, but I hope it is clear.)
>
> Thanks,
>
> Mark Orr
>
> --
> Mark G. Orr, PhD
> Heilbrunn Dept. of Population and Family Health
> Columbia University
> 60 Haven Ave., B-2
> New York, NY 10032
> Tele: 212-304-7823
> Fax: 212-305-7024
> www.columbia.edu/~mo2259

-- 
Jim Holtman
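An alternative to eval(parse()) is to let get() fetch only the object and then apply `$` to the result, or to avoid numbered variable names altogether by keeping the fits in a list. The sketch below uses small hand-made lists in place of real pam() results, which is an assumption made so that it runs without the cluster package.

```r
# Stand-ins for pam() results; only the component used here is present
x.1 <- list(silinfo = list(avg.width = 0.41))
x.2 <- list(silinfo = list(avg.width = 0.57))

# Option 1: get() the object by name, then extract the component
widths <- sapply(1:2, function(i) {
  get(paste("x.", i, sep = ""))$silinfo$avg.width
})
print(widths)

# Option 2: keep the fits in a list from the start
fits <- list(x.1, x.2)
widths2 <- sapply(fits, function(f) f$silinfo$avg.width)
print(widths2)
```

With the list version the first loop of the original code would become something like fits[[i]] <- pam(x, i), and no assign() or get() is needed at all.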
Re: [R] Finding matches in 2 files
Is this what you want?

> g1 <- c("gene1", "gene2", "gene3", "gene4", "gene5", "gene9", "gene10",
+     "geneA")
> g2 <- c("gene6", "gene9", "gene1", "gene2", "gene7", "gene8", "gene9",
+     "gene1", "gene10")
> intersect(g1, g2)
[1] "gene1"  "gene2"  "gene9"  "gene10"

On 7/25/07, jenny tan [EMAIL PROTECTED] wrote:
> I have 2 files containing data analysed by 2 different methods. I
> would like to find out which genes appear in both analyses. Can
> someone show me how to do this?

-- 
Jim Holtman
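Since the question starts from two files, here is a sketch of the whole round trip. The file layout (one gene name per line, no header) is an assumption, and temporary files are used so the example is self-contained.

```r
# Two made-up result files, one gene name per line
f1 <- tempfile(); f2 <- tempfile()
writeLines(c("gene1", "gene2", "gene3", "gene9"), f1)
writeLines(c("gene6", "gene9", "gene1", "gene9"), f2)

g1 <- readLines(f1)
g2 <- readLines(f2)

common  <- intersect(g1, g2)     # genes found by both methods
only.g1 <- setdiff(g1, g2)       # genes found by the first method only
print(common)
print(only.g1)
```

If the files have several columns, read.table() followed by intersect() on the gene-name column works the same way.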
Re: [R] Create Strings of Column Id's
Is this what you want:

> paste("-", paste(colnames(MyMatrix)[COL], collapse='-'), sep='')
[1] "-E-T"

On 7/26/07, Tom.O [EMAIL PROTECTED] wrote:
> Does anyone know how this is done?
>
> I have a large matrix where I extract specific columns into txt files
> for further use. To be able to keep track of which txt files contain
> which columns I want to name the files with the column Id's.
>
> The most basic example would be to use a for() loop together with
> paste(), but the result is blank. Not even NULL.
>
> This is the concept of the code I use, for example:
>
> MyMatrix <- matrix(NA,ncol=4,nrow=1,dimnames=list(NULL,c("E","R","T","Y")))
> COL <- c(1,3)      # a vector of columns I want to extract
> Filename <- NULL   # the starting variable, so I can use paste
> Filename <- for(i in colnames(MyMatrix)[COL]) {paste(Filename,"-",i,sep="")}
>
> The result is "-T", but I want it to be "-E-T".
>
> Anyone have a clue?
>
> Thanks, Tom
> --
> View this message in context:
> http://www.nabble.com/Create-Strings-of-Column-Id%27s-tf4153354.html#a11816439
> Sent from the R help mailing list archive at Nabble.com.

-- 
Jim Holtman
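For completeness, the loop in the question can also be made to work: the assignment has to happen inside the loop body, because the value of a for() statement itself is not the accumulated string. A sketch using the poster's own objects:

```r
MyMatrix <- matrix(NA, ncol = 4, nrow = 1,
                   dimnames = list(NULL, c("E", "R", "T", "Y")))
COL <- c(1, 3)

# Loop version: update Filename on every pass
Filename <- ""
for (i in colnames(MyMatrix)[COL]) {
  Filename <- paste(Filename, "-", i, sep = "")
}
print(Filename)

# Vectorised version from the reply, for comparison
Filename2 <- paste("-", paste(colnames(MyMatrix)[COL], collapse = "-"), sep = "")
print(Filename2)
```

Both give the same string; the collapse form is shorter and avoids growing a string step by step.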
Re: [R] Convert string to list?
Is this what you want:

> str <- "P = 0.0, T = 0.0, Q = 0.0"
> x <- eval(parse(text=paste('list(', str, ')')))
> str(x)
List of 3
 $ P: num 0
 $ T: num 0
 $ Q: num 0

On 7/26/07, Manuel Morales [EMAIL PROTECTED] wrote:
> Let's say I have the following string:
>
> str <- "P = 0.0, T = 0.0, Q = 0.0"
>
> I'd like to find a function that generates the following object from
> 'str':
>
> list(P = 0.0, T = 0.0, Q = 0.0)
>
> Thanks!
>
> --
> http://mutualism.williams.edu

-- 
Jim Holtman
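eval(parse()) will run whatever code the string contains, which may not be desirable if the string comes from outside. When the string is known to be a flat list of name = number pairs, it can be split by hand instead. This sketch makes exactly that assumption and will not handle nested or quoted values.

```r
str <- "P = 0.0, T = 0.0, Q = 0.0"

# Split into "name = value" pieces, then into name and value
parts <- strsplit(strsplit(str, ",")[[1]], "=")

x <- lapply(parts, function(p) as.numeric(p[2]))
names(x) <- sapply(parts, function(p) gsub(" ", "", p[1]))

str(x)
```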
Re: [R] Obtaining summary of frequencies of value occurrences for a variable in a multivariate dataset.
Also if you want to access the individual values, you can just leave it as a list: x.val - apply(x, 2, table) # access each value x.val$PR14[V] V 8 On 7/25/07, Allan Kamau [EMAIL PROTECTED] wrote: A subset of the data looks as follows df[1:10,14:20] PR10 PR11 PR12 PR13 PR14 PR15 PR16 1 VTIKVGD 2 VSIKVGG 3 VTIRVGG 4 VSIKIGG 5 VSIKVGG 6 VSIRVGG 7 VTIKIGG 8 VSIKVEG 9 VSIKVGG 10VSIKVGG The result I would like is as follows PR10PR11 PR12 ... [V:10][S:7,T:3][I:10] The result can be in a matrix or a vector and each variablename, value and frequency should be accessible so as to be used for comparisons with another dataset later. The frequency can be a count or a percentage. Allan. - Original Message From: Adaikalavan Ramasamy [EMAIL PROTECTED] To: Allan Kamau [EMAIL PROTECTED] Cc: r-help@stat.math.ethz.ch Sent: Tuesday, July 24, 2007 10:21:51 PM Subject: Re: [R] Obtaining summary of frequencies of value occurrences for a variable in a multivariate dataset. The name of the table should give you the value. And if you have a matrix, you just need to convert it into a vector first. m - matrix( LETTERS[ c(1:3, 3:5, 2:4) ], nc=3 ) m [,1] [,2] [,3] [1,] A C B [2,] B D C [3,] C E D tb - table( as.vector(m) ) tb A B C D E 1 2 3 2 1 paste( names(tb), :, tb, sep= ) [1] A:1 B:2 C:3 D:2 E:1 If this is not what you want, then please give a simple example. Regards, Adai Allan Kamau wrote: Hi all, If the question below as been answered before I apologize for the posting. I would like to get the frequencies of occurrence of all values in a given variable in a multivariate dataset. In short for each variable (or field) a summary of values contained with in a value:frequency pair, there can be many such pairs for a given variable. I would like to do the same for several such variables. I have used table() but am unable to extract the individual value and frequency values. Please advise. Allan. 
__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve? __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Obtaining summary of frequencies of value occurrences for a variable in a multivariate dataset.
Is this what you want: x - read.table(textConnection( PR10 PR11 PR12 PR13 PR14 PR15 PR16 + 1 VTIKVGD + 2 VSIKVGG + 3 VTIRVGG + 4 VSIKIGG + 5 VSIKVGG + 6 VSIRVGG + 7 VTIKIGG + 8 VSIKVEG + 9 VSIKVGG + 10VSIKVGG), header=TRUE) x.t - apply(x, 2, function(.col){ + .tab - table(.col) + paste('[', paste(names(.tab), .tab, sep=:, collapse=','), ']', sep='') + }) x.t PR10PR11PR12PR13PR14PR15 [V:10] [S:7,T:3][I:10] [K:8,R:2] [I:2,V:8] [E:1,G:9] PR16 [D:1,G:9] On 7/25/07, Allan Kamau [EMAIL PROTECTED] wrote: A subset of the data looks as follows df[1:10,14:20] PR10 PR11 PR12 PR13 PR14 PR15 PR16 1 VTIKVGD 2 VSIKVGG 3 VTIRVGG 4 VSIKIGG 5 VSIKVGG 6 VSIRVGG 7 VTIKIGG 8 VSIKVEG 9 VSIKVGG 10VSIKVGG The result I would like is as follows PR10PR11 PR12 ... [V:10][S:7,T:3][I:10] The result can be in a matrix or a vector and each variablename, value and frequency should be accessible so as to be used for comparisons with another dataset later. The frequency can be a count or a percentage. Allan. - Original Message From: Adaikalavan Ramasamy [EMAIL PROTECTED] To: Allan Kamau [EMAIL PROTECTED] Cc: r-help@stat.math.ethz.ch Sent: Tuesday, July 24, 2007 10:21:51 PM Subject: Re: [R] Obtaining summary of frequencies of value occurrences for a variable in a multivariate dataset. The name of the table should give you the value. And if you have a matrix, you just need to convert it into a vector first. m - matrix( LETTERS[ c(1:3, 3:5, 2:4) ], nc=3 ) m [,1] [,2] [,3] [1,] A C B [2,] B D C [3,] C E D tb - table( as.vector(m) ) tb A B C D E 1 2 3 2 1 paste( names(tb), :, tb, sep= ) [1] A:1 B:2 C:3 D:2 E:1 If this is not what you want, then please give a simple example. Regards, Adai Allan Kamau wrote: Hi all, If the question below as been answered before I apologize for the posting. I would like to get the frequencies of occurrence of all values in a given variable in a multivariate dataset. 
In short, for each variable (or field) I would like a summary of the values it contains as value:frequency pairs; there can be many such pairs for a given variable. I would like to do the same for several such variables. I have used table() but am unable to extract the individual value and frequency values. Please advise. Allan.

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve?
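The list approach from this thread can be sketched as follows. This is only an illustration: the data frame `df` below holds three of the columns from the example, and it uses `lapply()` instead of `apply()` (a substitution, not what the replies used) so that the result is always a list with one table per column.

```r
# Three columns taken from the example in the thread
df <- data.frame(
  PR10 = rep("V", 10),
  PR11 = c("T", "S", "T", "S", "S", "S", "T", "S", "S", "S"),
  PR14 = c("V", "V", "V", "I", "V", "V", "I", "V", "V", "V"),
  stringsAsFactors = FALSE
)

# One frequency table per column, kept as a named list
freq <- lapply(df, table)

# Individual counts are accessible by column name and value
n_T <- as.integer(freq$PR11["T"])   # count of "T" in PR11
n_V <- as.integer(freq$PR14["V"])   # count of "V" in PR14

# Percentages instead of counts
pct <- lapply(freq, function(tab) 100 * prop.table(tab))

# Compact "[value:count,...]" strings, one per column
labels <- vapply(freq, function(tab) {
  paste0("[", paste(names(tab), tab, sep = ":", collapse = ","), "]")
}, character(1))
print(labels)
```

Keeping the tables in a list leaves every variable name, value and frequency addressable for a later comparison with another dataset; the pasted strings are for display only.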
Re: [R] if - else
try: Start - ifelse (DateFirstEven DateSecondEvent, (DateFirstEvent+DateSecondEvent)/2, DateFound) On 7/25/07, James J. Roper [EMAIL PROTECTED] wrote: Greetings, I have some confusion with the use of if - else. Let's say I have a four variables as follows: Condition DateFound DateFirstEvent DateSecondEvent NA10Jan2000 NA NA 0 05Jan2000 07Jan2000 10Jan2000 1 07Jan2000 07Jan2000 08Jan2000 2 09Jan2000 NA NA Now, what I need to do is make a new variable that is either the midpoint of the first and second event dates, or the date found (I will call Start). I tried an if - else condition as follows: Start - if (DateFirstEven DateSecondEvent) (DateFirstEvent+DateSecondEvent)/2 else DateFound I also tried Start - if (any(DateFirstEven DateSecondEvent)) (DateFirstEvent+DateSecondEvent)/2 else DateFound Only the first half of the expression was ever evaluated. I hope I have not been to brief, and will certainly appreciate any help. Thanks, Jim -- James J. Roper Population Dynamics and Conservation of Terrestrial Vertebrates Caixa Postal 19034 81531-990 Curitiba, Paraná, Brasil === E-mail: [EMAIL PROTECTED] Phone/Fone/Teléfono: 55 41 33611764 celular: 55 41 99870543 Casa: 55 41 33857249 === Ecologia e Conservação na UFPR http://www.bio.ufpr.br/ecologia/ --- http://jjroper.googlepages.com/ http://arsartium.googlepages.com/ __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve? __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
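A minimal sketch of why `ifelse()` helps here: `if` looks at a single condition, while `ifelse()` decides row by row. The comparison operator in the thread was lost in the archive, so this sketch makes a different, clearly labeled assumption: the midpoint is used whenever both event dates are present, and `DateFound` otherwise. The data frame `d` is a hypothetical rebuild of the four rows in the question.

```r
# Hypothetical rebuild of the four rows from the question
d <- data.frame(
  DateFound       = as.Date(c("2000-01-10", "2000-01-05", "2000-01-07", "2000-01-09")),
  DateFirstEvent  = as.Date(c(NA, "2000-01-07", "2000-01-07", NA)),
  DateSecondEvent = as.Date(c(NA, "2000-01-10", "2000-01-08", NA))
)

# Assumption: use the midpoint when both event dates exist
has_both <- !is.na(d$DateFirstEvent) & !is.na(d$DateSecondEvent)

# Work on day counts, because ifelse() drops the Date class
mid <- floor((as.numeric(d$DateFirstEvent) + as.numeric(d$DateSecondEvent)) / 2)
start_num <- ifelse(has_both, mid, as.numeric(d$DateFound))

# Convert the day counts back to dates
d$Start <- as.Date(start_num, origin = "1970-01-01")
print(d)
```

The midpoint is rounded down to a whole day here; drop the `floor()` if fractional days are acceptable.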
Re: [R] Passing equations as arguments
Here is one possible solution: ifun - function(a, b, FUN){ evala - FUN(a) evalb - FUN(b) if (evala evalb) return(evala) else return(evalb) } ifun(1,2,function(x) (x*x) - 2) On 7/24/07, Anup Nandialath [EMAIL PROTECTED] wrote: Friends, I'm trying to pass an equation as an argument to a function. The idea is as follows. Let us say i write an independent function Ideal Situation: ifunc - function(x) { return((x*x)-2) } mainfunc - function(a,b) { evala - ifunc(a) evalb - ifunc(b) if (evalaevalb){return(evala)} else return(evalb) } Now I want to try and write this entire program in a single function with the user specifying the equation as an argument to the function. myfunc - function(a, b, eqn) { func1 - function (x) ?? { return(eqn in terms of x) ?? } Further arguments to check The imply that this does not seem to be correct. The idea is how to assign the equation expression from the main equation into the inner function. Is there anyway to do that within this set up? Thanks in advance Regards Anup - [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve? __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
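Two ways of handing the equation to a function can be sketched as below. The names `pick_larger` and `pick_larger_expr` are hypothetical, and the comparison operator, which did not survive in the archived text, is assumed to be `>`.

```r
# Variant 1: the caller passes a function
pick_larger <- function(a, b, FUN) {
  FUN <- match.fun(FUN)        # accepts a function or its name as a string
  evala <- FUN(a)
  evalb <- FUN(b)
  # Assumption: the comparison lost from the thread is taken to be ">"
  if (evala > evalb) evala else evalb
}

# Variant 2: the caller passes an unevaluated expression written in terms of x
pick_larger_expr <- function(a, b, eqn) {
  f <- function(x) eval(eqn, list(x = x))   # evaluate eqn with x bound to the input
  pick_larger(a, b, f)
}

r1 <- pick_larger(1, 2, function(x) x * x - 2)
r2 <- pick_larger_expr(1, 2, quote(x * x - 2))
print(c(r1, r2))
```

Passing a function is usually the simpler design; the `quote()` variant is closer to "the user specifies the equation" but relies on the expression using the variable name `x`.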
Re: [R] code optimization tips
First question is why are you defining the functions within the main function each time? Why don't you define them once outside? On 7/23/07, baptiste Auguié [EMAIL PROTECTED] wrote: Hi, Being new to R I'm asking for some advice on how to optimize the performance of the following piece of code: alpha_c - function(lambda=600e-9,alpha_s=1e-14,N=400,spacing=1e-7){ k-2*pi/lambda ri-c(0,0) # particle at the origin x-c(-N:N) positions - function(N) { reps - 2*N+1 matrix(c(rep(-N:N, each = reps), rep(-N:N, times = reps)), nrow = 2, byrow = TRUE) } rj-positions(N)*spacing # all positions in the 2N x 2N array rj-rj[1:2,-((dim(rj)[2]-1)/2+1)] # remove rj=(0,0) mod-function(x){sqrt(x[1]^2+x[2]^2)} # modulus sij -function(rj){ rij=mod(rj-ri) cos_ij=rj[1]/rij sin_ij=rj[2]/rij A-(1-1i*k*rij)*(3*cos_ij^2-1)*exp(1i*k*rij)/(rij^3) B-k^2*sin_ij^2*exp(1i*k*rij)/rij sij-A+B } s_ij-apply(rj,2,sij) S-sum(s_ij) alpha_s/(1-alpha_s*S) } alpha_c() This function is to be called for a few tens of values of lambda in a 'for' loop, and possibly a couple of different N and spacing (their magnitude is typically around the default one). This can be a bit slow ––– not that I would expect otherwise --- and I wonder if there is something I could do to optimize it (vectorize with respect to the lambda parameter?, change the units of the problem to deal with numbers closer to unity?,...) Best regards, baptiste __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve? __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] code optimization tips
The quote what is the problem you are trying to solve is just part of my signature. I used to review projects for performance and architecture and that was the first question I always asked them. To pass the argument, if you notice the definition of apply: apply(X, MARGIN, FUN, ...) the ... are optional argument, so for your function: sij -function(rj,ri,k){ rij=mod(rj-ri) cos_ij=rj[1]/rij sin_ij=rj[2]/rij A-(1-1i*k*rij)*(3*cos_ij^2-1)*exp(1i*k*rij)/(rij^3) B-k^2*sin_ij^2*exp(1i*k*rij)/rij sij-A+B } you would call apply with the following: s_ij-apply(rj,2,sij, ri=ri, k=k) On 7/23/07, baptiste Auguié [EMAIL PROTECTED] wrote: Thanks for your reply, On 23 Jul 2007, at 15:19, jim holtman wrote: First question is why are you defining the functions within the main function each time? Why don't you define them once outside? Fair enough! As said, I'm new to R and don't know whether it is best to define functions outside and pass to them all necessary arguments, or nest them and get variables in the scope from parents. In any case, I'd agree my positions(), mod() and sij() functions would be better outside. Here is a corrected version (untested as something else is running), positions - function(N) { reps - 2*N+1 matrix(c(rep(-N:N, each = reps), rep(-N:N, times = reps)), nrow = 2, byrow = TRUE) } mod-function(x){sqrt(x[1]^2+x[2]^2)} # modulus sij -function(rj,ri,k){ rij=mod(rj-ri) cos_ij=rj[1]/rij sin_ij=rj[2]/rij A-(1-1i*k*rij)*(3*cos_ij^2-1)*exp(1i*k*rij)/(rij^3) B-k^2*sin_ij^2*exp(1i*k*rij)/rij sij-A+B } alpha_c - function(lambda=600e-9,alpha_s=1e-14,N=400,spacing=1e-7){ k-2*pi/lambda ri-c(0,0) # particle at the origin rj-positions(N)*spacing # all positions in the 2N x 2N array rj-rj[1:2,-((dim(rj)[2]-1)/2+1)] # remove rj=(0,0) s_ij-apply(rj,2,sij) *** Now, how do I pass k and ri to this function ? *** S-sum(s_ij) alpha_s/(1-alpha_s*S) } alpha_c() -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve? 
Wondering whether that's part of the signature? The problem is related to scattering by arrays of particles, more specifically to evaluating the array's influence on the effective polarizability (alpha) of a particle via dipolar radiative coupling.

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve?
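How `apply()` forwards extra arguments through `...` can be shown with a toy function. `scaled_dist` below is a hypothetical stand-in for `sij()`, not the function from the thread; only the calling pattern is the point.

```r
# Toy stand-in for sij(): distance of a 2-D point rj from ri, scaled by k
scaled_dist <- function(rj, ri, k) {
  k * sqrt(sum((rj - ri)^2))
}

# One point per column: (3, 4) and (6, 8)
rj <- matrix(c(3, 4,
               6, 8), nrow = 2)
ri <- c(0, 0)
k  <- 2

# Each column is passed as the first argument; ri and k are forwarded by name
res <- apply(rj, 2, scaled_dist, ri = ri, k = k)
print(res)
```

Naming the forwarded arguments (`ri = ri, k = k`) avoids them being matched positionally against `apply()`'s own parameters.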
Re: [R] Dataframe of factors transform speed?
, it otherwise is ncol(genoT) instead of 10 +gt-genoT[[j]] #-- this is to avoid 2D indices +for(l in 1:length([EMAIL PROTECTED])){ + levels(gt)[l] - switch([EMAIL PROTECTED],AA=0,AB=1,BB=2) #-- convert levels to 0,1, or 2 + genoT[[j]]-factor(gt,levels=0:2) #-- make a 3-level factor and put it back +} + } + ) [1] 785.085 4.358 789.454 0.000 0.000 789s for 10 columns only! To me it seems like replacing 10 x 3 levels and then making a factor of 1002 element vector x 10 is a negligible amount of operations needed. So, what's wrong with me? Any idea how to accelerate significantly the transformation or (to go to the very beginning) to make read.table use a fixed set of levels (AA,AB, and BB) and not to drop any (missing) level? R-devel_2006-08-26, Sun Solaris 10 OS - x86 64-bit The machine is with 32G RAM and AMD Opteron 285 (2.? GHz) so it's not it. Thank you very much for the help, Latchezar Dimitrov, Analyst/Programmer IV, Wake Forest University School of Medicine, Winston-Salem, North Carolina, USA __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve? __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Dataframe of factors transform speed?
The problem is in the way that 'as.data.frame' works. Use Rprof on a small list and you will see where it is spending its time. Now if you are really sure that all your data is consistent with being a data frame, you can create your own dataframe structure your self. Not that I would advocate it, but if you look at the output of 'dput' on a dataframe, you can construct your own. Here it took 20 seconds to create the test data with a list of 50,000 and only 2 seconds to create the data frame from that. set.seed(123) n - 5 system.time({ + genoT - lapply(1:n, function(i) factor(sample(c(AA, + AB, BB), 1000, prob=c(1000, 1, 1), rep=T))) + }) user system elapsed 20.850.12 22.83 names(genoT) = paste(snp, 1:n, sep=) # create your own data frame structure -- if you are real sure of your data system.time(genoTz - structure(genoT, .Names=names(genoT), + row.names=c(NA, -length(genoT[[1]])), class='data.frame')) user system elapsed 2.000.082.11 str(genoTz) 'data.frame': 1000 obs. of 5 variables: $ snp1: Factor w/ 2 levels AA,AB: 1 1 1 1 1 1 1 1 1 1 ... $ snp2: Factor w/ 3 levels AA,AB,BB: 1 1 1 1 1 1 1 1 1 1 ... $ snp3: Factor w/ 2 levels AA,AB: 1 1 1 1 1 1 1 1 1 1 ... $ snp4: Factor w/ 2 levels AA,AB: 1 1 1 1 1 1 1 1 1 1 ... $ snp5: Factor w/ 3 levels AA,AB,BB: 1 1 1 1 1 1 1 1 1 1 ... $ snp6: Factor w/ 2 levels AA,AB: 1 1 1 1 1 1 1 1 1 1 ... $ snp7: Factor w/ 1 level AA: 1 1 1 1 1 1 1 1 1 1 ... $ snp8: Factor w/ 2 levels AA,BB: 1 1 1 1 1 1 1 1 1 1 ... $ snp9: Factor w/ 3 levels AA,AB,BB: 1 1 1 1 1 1 1 1 1 1 ... $ snp10 : Factor w/ 3 levels AA,AB,BB: 1 1 1 1 1 1 1 1 1 1 ... $ snp11 : Factor w/ 1 level AA: 1 1 1 1 1 1 1 1 1 1 ... On 7/21/07, Latchezar Dimitrov [EMAIL PROTECTED] wrote: Jim, No, this is _not the problem. 
If you go to my 1st mail I have a monster (at least was when I purchased it) with 32GB (sic :-) of RAM and 4 dual core AMD64 285 (the fastest at that time and still pretty fast now :-) The machine stats paging when I run 2 copies of R working on two things like that :-). If you look at my last e-mail I found a solution but still have no clue why the heck x-as.data.frame(y) where why is a list of the same columns take real for ever and this the thing that killed me before. Thanks, Latchezar -Original Message- From: jim holtman [mailto:[EMAIL PROTECTED] Sent: Saturday, July 21, 2007 5:33 PM To: Latchezar Dimitrov Cc: Benilton Carvalho; r-help@stat.math.ethz.ch Subject: Re: [R] Dataframe of factors transform speed? One of the problems is that you are probably paging on your system with an object that size (24 x 1000). This is about 1GB for a single object: set.seed(123) n - 24 system.time({ + genoT - lapply(1:n, function(i) factor(sample(c(AA, AB, BB), + 1000, prob=c(1000, 1, 1), rep=T))) + }) user system elapsed 95.000.61 104.71 names(genoT) = paste(snp, 1:n, sep=) object.size(genoT) [1] 1045258752 I can create it on my 2GB machine as a list, but have problems converting it to a dataframe because I don't have enough memory. So unless you have at least 4GB on your system, it might take a long time. Look at your performance measurements on your system and see if you have run out of physical memory and are paging. On 7/21/07, Latchezar Dimitrov [EMAIL PROTECTED] wrote: Hi, Thanks for the help. My 1st question still unanswered though :-) Please see bellow -Original Message- From: Benilton Carvalho [mailto:[EMAIL PROTECTED] Sent: Friday, July 20, 2007 3:30 AM To: Latchezar Dimitrov Cc: r-help@stat.math.ethz.ch Subject: Re: [R] Dataframe of factors transform speed? 
set.seed(123) genoT = lapply(1:24, function(i) factor(sample(c(AA, AB, BB), 1000, prob=sample(c(1, 1000, 1000), 3), rep=T))) names(genoT) = paste(snp, 1:24, sep=) genoT = as.data.frame(genoT) Now this _is the problem. Everything before converting to data.frame worked almost instantaneously however as.data.frame runs forever. Obviously there is some scalability memory management issue. When I tried my own method but creating a new result (instead of modifying the old) dataframe it worked like a charm for the 1st 100 cols ~ .3s. I figured 300,000 cols should be ~1000s. Nope! It ran for about 50,000(!)s to finish about 42,000 cols only. BTW, what ver. of R is yours? Now here's what I discovered further. #-- create a 1-col frame: geno - data.frame(c(geno.GASP[[1]],geno.JAG[[1]]),row.names=c(rownames(geno.G AS P),rownames(geno.JAG))) #-- main code I repeated it w/ j in 1:1000, 2001:3000, and 3001:4000, i.e., adding a 1000 of cols to geno each time system.time( # for(j in 1:(ncol(geno.GASP ))){ for(j in 3001:(4000 )){ gt.GASP-geno.GASP
Re: [R] binned column in a data.frame
You can also use 'cut' to break the bins:

    > x <- c(1,2,6,8,13,0,5,10, runif(10) * 100)
    > x.bins <- seq(0, max(x)+5, 5)
    > x.cut <- cut(x, breaks=x.bins, include.lowest=TRUE)
    > x.names <- paste(head(x.bins, -1), tail(x.bins, -1), sep='-')
    > data.frame(x, bins=x.names[x.cut])
              x  bins
    1   1.0       0-5
    2   2.0       0-5
    3   6.0      5-10
    4   8.0      5-10
    5  13.0     10-15
    6   0.0       0-5
    7   5.0       0-5
    8  10.0      5-10
    9  75.85256 75-80
    10 38.20424 35-40
    11 77.30647 75-80
    12 62.02278 60-65
    13 73.42095 70-75
    14 78.69244 75-80
    15 66.52972 65-70
    16 61.64897 60-65
    17 23.99252 20-25
    18 42.08632 40-45

On 7/20/07, João Fadista [EMAIL PROTECTED] wrote:

Dear all,

I would like to know how I can create a binned column in a data.frame. The output that I would like is something like this:

    Start  Binned_Start
    1      0-5
    2      0-5
    6      5-10
    8      5-10
    13     10-15
    ...

Best regards
João Fadista
Ph.d. student
UNIVERSITY OF AARHUS
Faculty of Agricultural Sciences
Dept. of Genetics and Biotechnology
Blichers Allé 20, P.O. BOX 50
DK-8830 Tjele

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve?
Re: [R] SOS
You can use sprintf:

    > x <- runif(5)
    > x
    [1] 0.89838968 0.94467527 0.66079779 0.62911404 0.06178627
    > cat(sprintf("%.2f%% ", x * 100))
    89.84%  94.47%  66.08%  62.91%  6.18%

On 7/20/07, Fabrice McShort [EMAIL PROTECTED] wrote:

Hi Julian,

Thank you very much. Please let me know how to get 2 digits after the decimal point.

Best regards,
Fabrice

Date: Fri, 20 Jul 2007 08:15:42 -0700
From: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
CC: r-help@stat.math.ethz.ch
Subject: Re: [R] SOS

Multiply by 100? Add R=R*100

Fabrice McShort wrote:

Dear all,

I am a new user of R. I would like to know how to get a fund's returns in percentage (%). For example, I use:

    R <- ts(read.xls(FundData), frequency = 12, start = c(1996, 1))

With this program, the returns are like 0.0152699. But I would like to have 1.52%. Please advise me about the function. Thanks!

Fabrice

--
Julian M. Burgos
Fisheries Acoustics Research Lab
School of Aquatic and Fishery Science
University of Washington
1122 NE Boat Street
Seattle, WA 98105
Phone: 206-221-6864

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve?
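Since `sprintf()` is vectorised, the formatting step can be wrapped in a small helper. `as_percent` is a hypothetical name, and the input values are illustrative rather than real fund returns.

```r
# Format proportions as percentage strings with a chosen number of decimals
as_percent <- function(x, digits = 2) {
  fmt <- paste0("%.", digits, "f%%")   # e.g. "%.2f%%"; "%%" prints a literal %
  sprintf(fmt, 100 * x)
}

returns <- c(0.0152699, 0.25, -0.004)  # illustrative values
out <- as_percent(returns)
print(out)
```

The result is a character vector, so it is suited to display only; keep the numeric series (for example the `ts` object) for any further calculation. Note that 0.0152699 rounds to 1.53%, not 1.52%.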
Re: [R] plot3d labels
The documentation has:

    text3d(x, y = NULL, z = NULL, texts, adj = 0.5, justify, ...)

Does this do it for you?

On 7/19/07, Birgit Lemcke [EMAIL PROTECTED] wrote:

Hello R users,

I am a newbie using R 2.5.0 on an Apple PowerBook G4 with Mac OS X 10.4.10. Sorry that I again ask such stupid questions, but I haven't found how to label the points created with plot3d (rgl). Hope somebody can help me. Thanks in advance.

Birgit

Birgit Lemcke
Institut für Systematische Botanik
Zollikerstrasse 107
CH-8008 Zürich
Switzerland
Ph: +41 (0)44 634 8351
[EMAIL PROTECTED]

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve?
Re: [R] linear interpolation of multiple random time series
This should do it for you: x - read.table(textConnection(trial timex + 1 1 1 + 1 5 4 + 1 7 9 + 1 12 20 + 2 1 0 + 2 3 5 + 2 9 10 + 2 13 14 + 2 19 22 + 2 24 32), header=TRUE) # compute for each trial trial.list - lapply(split(x, x$trial), function(set){ + .xval - seq(min(set$time), max(set$time)) + .yval - approx(set$time, set$x, xout=.xval)$y + cbind(trial=set$trial[1], time=.xval, x=.yval) + }) do.call('rbind', trial.list) trial time x [1,] 11 1.00 [2,] 12 1.75 [3,] 13 2.50 [4,] 14 3.25 [5,] 15 4.00 [6,] 16 6.50 [7,] 17 9.00 [8,] 18 11.20 [9,] 19 13.40 [10,] 1 10 15.60 [11,] 1 11 17.80 [12,] 1 12 20.00 [13,] 21 0.00 [14,] 22 2.50 [15,] 23 5.00 [16,] 24 5.83 [17,] 25 6.67 [18,] 26 7.50 [19,] 27 8.33 [20,] 28 9.17 [21,] 29 10.00 [22,] 2 10 11.00 [23,] 2 11 12.00 [24,] 2 12 13.00 [25,] 2 13 14.00 [26,] 2 14 15.33 [27,] 2 15 16.67 [28,] 2 16 18.00 [29,] 2 17 19.33 [30,] 2 18 20.67 [31,] 2 19 22.00 [32,] 2 20 24.00 [33,] 2 21 26.00 [34,] 2 22 28.00 [35,] 2 23 30.00 [36,] 2 24 32.00 On 7/19/07, Mike Lawrence [EMAIL PROTECTED] wrote: Hi all, Looking for tips on how I might more optimally solve this. I have time series data (samples from a force sensor) that are not guaranteed to be sampled at the same time values across trials. ex. trial timex 1 1 1 1 5 4 1 7 9 1 12 20 2 1 0 2 3 5 2 9 10 2 13 14 2 19 22 2 24 32 Within each trial I'd like to use linear interpolation between each successive time sample to fill in intermediary timepoints and x- values, ex. 
trial timex 1 1 1 1 2 1.75 1 3 2.5 1 4 3.25 1 5 4 1 6 6.5 1 7 9 1 8 11.2 1 9 13.4 1 10 15.6 1 11 17.8 1 12 20 2 1 0 2 2 2.5 2 3 5 2 4 5.83 2 5 6.67 2 6 7.5 2 7 8.33 2 8 9.17 2 9 10 2 10 11 2 11 12 2 12 13 2 13 14 2 14 15.3 2 15 16.7 2 16 18 2 17 19.3 2 18 20.7 2 19 22 2 20 24 2 21 26 2 22 28 2 23 30 2 24 32 The solution I've coded (below) involves going through the original data frame line by line and is thus very slow (indeed, I had to resort to writing to file as with a large data set I started running into memory issues if I tried to create the new data frame in memory). Any suggestions on a faster way to achieve what I'm trying to do? #assumes the first data frame above is stored as 'a' arows = (length(a$x)-1) write('', 'temp.txt') for(i in 1:arows){ if(a$time[i+1] a$time[i]){ write.table(a[i,], 'temp.txt', row.names = F, col.names = F, append = T) x1 = a$time[i] x2 = a$time[i+1] dx = x2-x1 if(dx != 1){ y1 = a$x[i] y2 = a$x[i+1] dy = y2-y1 slope = dy/dx int = -slope*x1+y1 temp=a[i,] for(j in (x1+1):(x2-1)){ temp$time = j temp$x = slope*j+int write.table(temp, 'temp.txt', row.names = F, col.names = F, append = T) } } }else{ write.table(a[i,], 'temp.txt', row.names = F, col.names = F, append = T) } } i=i+1 write.table(a[i,], 'temp.txt', row.names = F, col.names = F, append = T) b=read.table('temp.txt',skip=1) names(b)=names(a) __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve? __ R-help@stat.math.ethz.ch
Re: [R] Help with Dates
Try some of the following: head(subset(df, Yr %in% c(00,01,02,03))) subset(df, (Yr = '00') (Yr = '03')) # same as above subset(df, (Yr == '00') | (Yr == '01') | (Yr == '02') |(Yr == '03')) # same On 7/19/07, Alex Park [EMAIL PROTECTED] wrote: R I am taking an excel dataset and reading it into R using read.table. (actually I am dumping the data into a .txt file first and then reading data in to R). Here is snippet: head(data); Date Price Open.Int. Comm.Long Comm.Short net.comm 1 15-Jan-86 673.25175645 65910 2842537485 2 31-Jan-86 677.00167350 54060 2712026940 3 14-Feb-86 680.25157985 37955 2542512530 4 28-Feb-86 691.75162775 49760 1603033730 5 14-Mar-86 706.50163495 54120 2799526125 6 31-Mar-86 709.75164120 54715 3039024325 The dataset runs from 1986 to 2007. I want to be able to take subsets of my data based on date e.g. data between 2000 - 2005. As it stands, I can't work with the dates as they are not in correct format. I tried successfully converting the dates to just the year using: transform(data, Yr = format(as.Date(as.character(Date),format = '%d-%b-%y'), %y))) This gives the following format: Date Price Open.Int. Comm.Long Comm.Short net.comm Yr 1 15-Jan-86 673.25175645 65910 2842537485 86 2 31-Jan-86 677.00167350 54060 2712026940 86 3 14-Feb-86 680.25157985 37955 2542512530 86 4 28-Feb-86 691.75162775 49760 1603033730 86 5 14-Mar-86 706.50163495 54120 2799526125 86 6 31-Mar-86 709.75164120 54715 3039024325 86 I can subset for a single year e.g: head(subset(df, Yr ==00) But how can I subset for multiple periods e.g 00- 05? The following won't work: head(subset(df, Yr ==00 Yr==01) or head(subset(df, Yr = c(00,01,02,03) I can't help but feeling that I am missing something and there is a simpler route. I leafed through R newletter 4.1 which deals with dates and times but it seemed that strptime and POSIXct / POSIXlt are not what I need either. Can anybody help me? 
Regards Alex

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve?
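An alternative to comparing two-digit year strings is to convert the column to a real `Date` and filter on that. The sketch below uses hypothetical rows in the thread's `dd-Mon-yy` layout (the prices are placeholders) and assumes an English locale, because `%b` matches the locale's abbreviated month names.

```r
# Hypothetical rows in the thread's layout; prices are placeholders
df <- data.frame(
  Date  = c("15-Jan-86", "31-Mar-99", "14-Feb-00", "28-Feb-03", "30-Dec-05", "15-Jan-07"),
  Price = c(673.25, 700.00, 680.25, 691.75, 706.50, 709.75),
  stringsAsFactors = FALSE
)

# Assumes an English locale for the month abbreviations matched by %b
df$Date <- as.Date(df$Date, format = "%d-%b-%y")

# Numeric year, so ranges can be compared directly
df$Year <- as.integer(format(df$Date, "%Y"))
sel <- subset(df, Year >= 2000 & Year <= 2005)

# Equivalent filter on the dates themselves
sel2 <- subset(df, Date >= as.Date("2000-01-01") & Date <= as.Date("2005-12-31"))
print(sel)
```

With `%y`, two-digit years 69 to 99 are read as 19xx and 00 to 68 as 20xx, which fits a series running from 1986 to 2007.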
Re: [R] can I paste 'newline'?
Notice the difference:

    > cat('I need to move on to a new line', '\n', 'at here')  # change line!
    I need to move on to a new line 
     at here
    > paste('I need to move on to a new line', '\n', 'at here')  # '\n' is just a character as it is.
    [1] "I need to move on to a new line \n at here"
    > cat(paste('I need to move on to a new line', '\n', 'at here'))
    I need to move on to a new line 
     at here
    > paste("a long string
    + with carriage
    + returns")
    [1] "a long string\nwith carriage\nreturns"
    > cat(paste("a long string
    + with carriage
    + returns"))
    a long string
    with carriage
    returns

paste is showing you the characters in the string; cat is actually outputting to a print device, where '\n' is a line feed.

On 7/19/07, runner [EMAIL PROTECTED] wrote:

It is ok to bury a reg expression '\n' when using 'cat', but not 'paste'. e.g.

    cat ('I need to move on to a new line', '\n', 'at here')   # change line!
    paste ('I need to move on to a new line', '\n', 'at here') # '\n' is just a character as it is.

Is there a way around pasting '\n'? Thanks a lot.

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve?
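The distinction between building a string and displaying it can be sketched as follows. The newline is stored in the string by `paste()` either way; only the display function decides whether it is shown escaped or rendered. The variable names are illustrative.

```r
parts <- c("I need to move on to a new line", "at here")

# paste() builds the string; the newline is a single character inside it
s <- paste(parts, collapse = "\n")

print(s)        # print() shows the escape sequence \n
cat(s, "\n")    # cat() renders it as a line break
writeLines(s)   # writeLines() also renders it and appends a final newline

# The string really does contain two lines
pieces <- strsplit(s, "\n", fixed = TRUE)[[1]]
```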
Re: [R] Dataframe of factors transform speed?
with different number of levels (from 1 to 3 - that's what I got from read.table, i.e., it dropped missing levels). I want to convert it to uniform factors with 3 levels. The 1st 10 rows above show already converted columns and the rest are not yet converted. Here's my attempt wich is a complete failure as speed: system.time( + for(j in 1:(10 )){ #-- this is to try 1st 10 cols and measure the time, it otherwise is ncol(genoT) instead of 10 +gt-genoT[[j]] #-- this is to avoid 2D indices +for(l in 1:length([EMAIL PROTECTED])){ + levels(gt)[l] - switch([EMAIL PROTECTED],AA=0,AB=1,BB=2) #-- convert levels to 0,1, or 2 + genoT[[j]]-factor(gt,levels=0:2) #-- make a 3-level factor and put it back +} + } + ) [1] 785.085 4.358 789.454 0.000 0.000 789s for 10 columns only! To me it seems like replacing 10 x 3 levels and then making a factor of 1002 element vector x 10 is a negligible amount of operations needed. So, what's wrong with me? Any idea how to accelerate significantly the transformation or (to go to the very beginning) to make read.table use a fixed set of levels (AA,AB, and BB) and not to drop any (missing) level? R-devel_2006-08-26, Sun Solaris 10 OS - x86 64-bit The machine is with 32G RAM and AMD Opteron 285 (2.? GHz) so it's not it. Thank you very much for the help, Latchezar Dimitrov, Analyst/Programmer IV, Wake Forest University School of Medicine, Winston-Salem, North Carolina, USA __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve? __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
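One way to give every column the same three levels without looping over individual levels is to rebuild each factor with an explicit `levels` argument. This is a sketch on a tiny hypothetical `genoT`, not a benchmark against the data in the thread.

```r
# Tiny hypothetical stand-in for genoT: columns with 2, 1 and 3 observed levels
genoT <- data.frame(
  snp1 = factor(c("AA", "AA", "AB", "AA")),
  snp2 = factor(c("AA", "AA", "AA", "AA")),
  snp3 = factor(c("BB", "AB", "AA", "AA"))
)

lv <- c("AA", "AB", "BB")

# Rebuild every column as a factor with the full, fixed set of levels;
# assigning into genoT[] keeps the data frame structure
genoT[] <- lapply(genoT, factor, levels = lv)

# Optional 0/1/2 coding derived from the level order
codes <- as.integer(genoT$snp3) - 1L
str(genoT)
```

Because each column is processed in one vectorised call, the work grows with the number of columns rather than with repeated modification of the whole data frame.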
Re: [R] remove columns having a partial match name
    DATA_OK <- DATA[, -grep("^Start", names(DATA))]

On 7/18/07, João Fadista [EMAIL PROTECTED] wrote:
> Dear all,
>
> I would like to know how I can retrieve a data.frame without the columns that have a partial match name. Let's say that I have a data.frame with 200 columns and 100 of them have the name StartX, with X being the unique part of each column name. I want to delete all columns whose names start with Start. I've tried this, but it doesn't work:
>
>     DATA_OK <- DATA[,-match("(Start*)",names(DATA))]
>     dim(DATA_OK)
>     NULL
>
> Thanks in advance.
>
> Best regards
>
> João Fadista
> Ph.d. student
> UNIVERSITY OF AARHUS
> Faculty of Agricultural Sciences
> Dept. of Genetics and Biotechnology
> Blichers Allé 20, P.O. BOX 50
> DK-8830 Tjele

--
Jim Holtman
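A small variation, sketched with a made-up four-column DATA: negative indexing with grep() drops every column when nothing matches (the index is then empty), so a logical index built with grepl() is a safer default. The column names here are invented for the example.

```r
# Hypothetical data frame with two "Start" columns
DATA <- data.frame(StartA = 1:3, StartB = 4:6, Stop = 7:9, Other = 10:12)

# Logical index: TRUE for the columns to keep
keep <- !grepl("^Start", names(DATA))
DATA_OK <- DATA[, keep, drop = FALSE]   # drop = FALSE keeps a data.frame even with one column left

# Applying the same filter again (no matches this time) leaves the columns intact
DATA_AGAIN <- DATA_OK[, !grepl("^Start", names(DATA_OK)), drop = FALSE]
```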
Re: [R] Classification
You can use 'cut':

    > x
         MD
    1  0.20
    2  0.10
    3  0.80
    4  0.30
    5  0.70
    6  0.60
    7  0.01
    8  0.20
    9  0.50
    10 1.00
    11 1.00
    > cut(x$MD, breaks=seq(0,1,.2), include.lowest=TRUE, labels=LETTERS[1:5])
     [1] A A D B D C A A C E E
    Levels: A B C D E

On 7/18/07, Ing. Michal Kneifl, Ph.D. [EMAIL PROTECTED] wrote:
> Hi,
> I am also quite a new user of R and would like to ask you for help:
> I have a data frame where all columns are numeric variables. My aim is to convert one column into a factor.
> Example:
>
>     MD
>     0.2
>     0.1
>     0.8
>     0.3
>     0.7
>     0.6
>     0.01
>     0.2
>     0.5
>     1
>     1
>
> I want to make classes:
>
>     0-0.2     A
>     0.21-0.4  B
>     0.41-0.6  C
>     ... and so on
>
> So after classification I will get:
>
>     MD
>     A
>     A
>     D
>     B
>     ... and so on
>
> Please could you give some advice to a newbie?
> Thanks a lot in advance.
>
> Michael

--
Jim Holtman
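To keep the classes next to the original values, the result of cut() can be stored as a new column. This is a sketch only: the column name MD.class is invented here, and the breaks are written out literally instead of using seq().

```r
x <- data.frame(MD = c(0.2, 0.1, 0.8, 0.3, 0.7, 0.6, 0.01, 0.2, 0.5, 1, 1))

# Intervals are closed on the right: (0,0.2] is A, (0.2,0.4] is B, and so on;
# include.lowest = TRUE also puts an exact 0 into class A
x$MD.class <- cut(x$MD,
                  breaks = c(0, 0.2, 0.4, 0.6, 0.8, 1),
                  include.lowest = TRUE,
                  labels = LETTERS[1:5])

# Number of observations per class
class.counts <- table(x$MD.class)
```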
Re: [R] nested for loop
This should create your files for you:

    x <- 1:1080  # test data
    # create a grouping vector: one value per run of 30 consecutive data points
    breaks <- rep(1:ceiling(length(x) / 30), each=30)[1:length(x)]
    # now partition the data into groups of 30 values and write them
    fileNo <- 1  # initialize the file number
    invisible(lapply(split(x, breaks), function(.values){
        write(.values, file=sprintf("NWRxx.%03d.txt", fileNo))
        fileNo <<- fileNo + 1  # update the file number outside the function
    }))

On 7/18/07, Sherri Heck [EMAIL PROTECTED] wrote:
> Hi,
>
> I am new to programming and R. I am reading the manual and R books by Dalgaard and Veranzo to help answer my questions, but I am unable to figure out the following:
>
> I have a data file that contains 1080 data points. Here's a snippet of the file:
>
>     [241] 0.3603704000 0.1640741000 0.2912963000 NA 0.0159259300 0.0474074100
>
> I would like to break the file up into segments of 30 consecutive data points and then write each segment into a separate data file. This is one version of the code that I've tried:
>
>     mons = c(1:12)
>     data = scan(paste("C:/R/NWR.txt"))
>     for (mon in mons) {
>       for (i in c(1:30)) {
>         for (j in data){
>           write((data),paste(mon,'NWR dc_dt_zi ppm meters per sec.txt',sep=''),ncol=1)
>         }
>       }
>     }
>
> I think I'm really close, but no cigar.
>
> Thanks in advance for any help-
>
> S.Heck
> Graduate Research Assistant
> University of Colorado, Boulder

--
Jim Holtman
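The same split can also be written without a counter that has to be updated from inside the function. A minimal sketch, assuming the output goes to a temporary directory (out.dir is a placeholder; the real target directory is not stated in the thread):

```r
x <- 1:1080                              # test data standing in for the scanned file

groups <- ceiling(seq_along(x) / 30)     # 1,1,...,1,2,2,... in runs of 30
chunks <- split(x, groups)               # list of 36 vectors, 30 values each

out.dir <- tempdir()                     # placeholder output location
files <- file.path(out.dir, sprintf("NWRxx.%03d.txt", seq_along(chunks)))

# One file per chunk, one value per line
for (k in seq_along(chunks)) {
  write(chunks[[k]], file = files[k], ncolumns = 1)
}
```

Building the file names up front makes it easy to check afterwards which files were produced.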
Re: [R] set up automatic running of R
Create a .bat file with the commands to run R in batch mode (R CMD BATCH), and then create a scheduled task that calls the batch file at the desired time.

On 7/18/07, Am Stat [EMAIL PROTECTED] wrote:
> Hi useR,
>
> I am trying to find out how to schedule an automatic, periodic run of R. I have written some scripts to extract data which are updated monthly on another server; my OS is XP.
>
> The goal is that my script will run at a scheduled time every month and record the results to some directories.
>
> The scripts are done; the only thing I need to know is how to let R run my scripts at a certain time, say the first Sunday of each month.
>
> Could anyone give me some clues?
>
> Thanks a million in advance!
>
> Best,
>
> Leon

--
Jim Holtman
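Schedulers do not always offer a "first Sunday of the month" rule, so one option is to schedule the task every Sunday (or every day) and let the script decide whether to do any work. A sketch under that assumption; the script name monthly_job.R, the helper is.first.sunday and the output location are all invented for illustration.

```r
# monthly_job.R (hypothetical name), run non-interactively, e.g. from a .bat
# file containing:  R CMD BATCH monthly_job.R

# TRUE when the given date is the first Sunday of its month
is.first.sunday <- function(d = Sys.Date()) {
  lt <- as.POSIXlt(as.Date(d))
  lt$wday == 0 && lt$mday <= 7           # wday 0 is Sunday
}

if (is.first.sunday()) {
  out.dir <- file.path(tempdir(), format(Sys.Date(), "%Y-%m"))  # placeholder location
  dir.create(out.dir, showWarnings = FALSE, recursive = TRUE)
  # The real data extraction would go here; a timestamp stands in for the results
  writeLines(format(Sys.time()), file.path(out.dir, "run.log"))
}
```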
Re: [R] memory error with 64-bit R in linux
Are you paging? That might explain the long run times. How much space are your other objects taking up? The matrix by itself should only require about 13MB if it is numeric. I would guess it is some of the other objects that you have in your working space. Put some gc() calls in your loop to see how much space is being used. Run it with a subset of the data and see how long it takes. This might give you an estimate of the time, and space, that might be needed for the entire dataset.

Do a 'ps' to see how much memory your process is using. Do one every couple of minutes to see if it is growing. You can always use Rprof() to get an idea of where time is being spent (use it on a small subset).

On 7/18/07, zhihua li [EMAIL PROTECTED] wrote:
> Hi netters,
>
> I'm using the 64-bit R-2.5.0 on an x86-64 CPU, with 2 GB of RAM. The operating system is SUSE 10. The system information is:
>
>     -uname -a
>     Linux someone 2.6.13-15.15-smp #1 SMP Mon Feb 26 14:11:33 UTC 2007 x86_64 x86_64 x86_64 GNU/Linux
>
> I used heatmap to process a matrix of dim [16000,100]. After 3 hours of desperate waiting, R told me: cannot allocate vector of size 896 MB.
>
> I know the matrix is very big, but since I have 2 GB of RAM on a 64-bit system, there should be no problem dealing with a vector smaller than 1 GB? (I was not running any other applications on my system.)
>
> Does anyone know what's going on? Is there a hardware limit where I have to add more RAM, or is there some way to resolve it in software? Also, is it possible to speed up the computation? (I don't want to wait another 3 hours only to get another error message.)
>
> Thank you in advance!

--
Jim Holtman
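The 13MB figure is simply rows times columns times 8 bytes per double. The same arithmetic hints at one place where a large allocation could arise: by default heatmap() clusters the rows using dist() and hclust(), and a distance object for n rows holds n(n-1)/2 doubles. The sketch below is only an estimate under that assumption, not a diagnosis of the reported error; the variable names are introduced here.

```r
n.row <- 16000
n.col <- 100
bytes.per.double <- 8

# Size of the numeric matrix itself, in MiB
matrix.mb <- n.row * n.col * bytes.per.double / 2^20

# Size of a full set of pairwise row distances, in MiB
dist.mb <- n.row * (n.row - 1) / 2 * bytes.per.double / 2^20

# Cross-check the per-element cost on a small matrix with object.size()
m <- matrix(0, nrow = 1000, ncol = n.col)
small.bytes <- as.numeric(object.size(m))
```

If the row dendrogram is not needed, heatmap(x, Rowv = NA) skips the row clustering altogether.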
Re: [R] Saving a dataset permanently in R
Where are you trying to copy data from? I would assume that with that script you are typing all the data in by hand. Why don't you put it in a text file and use read.table?

By default, R will save your workspace on exit and then reload it on startup. Is this enough to save your data? You can also use the 'save' function to store explicit objects.

On 7/18/07, Felipe Carrillo [EMAIL PROTECTED] wrote:
> Hi:
> I'm still struggling with datasets; the more I read about them, the more confused I get. This is the scenario: in R console | Edit | Data Editor, I can find all the datasets available with the different packages. To create a new dataset in the R console, I use the following commands to create an empty data frame:
>
>     My_Dataset <- data.frame()
>     My_Dataset <- edit(My_dataset)
>
> The problem is that I can't copy my data into the data frame. Are there any suggestions as to how I can transfer the data and how it can be saved, so that every time I open R the dataset is available?
> Thanks
>
> Felipe D. Carrillo
> Fishery Biologist
> US Fish Wildlife Service
> Red Bluff, California 96080

--
Jim Holtman
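The read.table / save route could look like the sketch below. The file contents, the column names and the temporary file locations are invented for the example; in practice the text file would be exported from wherever the data currently lives.

```r
# A small text file standing in for the real data (hypothetical contents)
data.file <- tempfile(fileext = ".txt")
writeLines(c("site count", "A 12", "B 7", "C 30"), data.file)

# Read it into a data frame instead of typing values into the data editor
My_Dataset <- read.table(data.file, header = TRUE)

# Store the object explicitly ...
rdata.file <- tempfile(fileext = ".RData")
save(My_Dataset, file = rdata.file)

# ... and restore it later, e.g. in a new session
rm(My_Dataset)
load(rdata.file)
```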
Re: [R] memory error with 64-bit R in linux
The output from gc() indicates that you had a maximum usage of 476MB + 119MB = ~600MB. If you look at the output of ps you will notice that the process size is 523MB (or about 500MB if you want to be exact). So you are using about 25% of the 2GB that you have available.

mem.limits() just shows the current values of the parameters, and as the help file says:

    Value
    mem.limits() returns an integer vector giving the current settings of the maxima, possibly NA.

On 7/18/07, zhihua li [EMAIL PROTECTED] wrote:
> Thanks for replying!
>
> I don't think I'm paging. I tried to use a smaller version of my matrix and did all the checks suggested by Jim. The smaller matrix caused another problem, for which I've opened another thread. But I've found something about memory that I don't understand.
>
>     > gc()
>                used (Mb) gc trigger  (Mb) max used  (Mb)
>     Ncells   269577 14.4    5570995 297.6  8919855 476.4
>     Vcells  3353395 25.6    9493567  72.5 15666095 119.6
>
> Does this mean the maximum memory I can use for variables is only 120 M?
>
> However, when I tried to check the memory limits:
>
>     > mem.limits()
>     nsize vsize
>        NA    NA
>
> Here it seems the maximum memory is not limited?
>
> While no R function was being executed, I checked the system processes with:
>
>     ps u
>      PID %CPU %MEM    VSZ    RSS TTY   STAT START TIME COMMAND
>     7821  0.0  0.1  10048   2336 pts/0 Ss   Jul18 0:00 -bash
>     8076  2.9 24.5 523088 504004 pts/0 S+   Jul18 2:46 /usr/lib64/R/bi
>     8918  1.5  0.1   9912   2328 pts/1 Ss   00:44 0:00 -bash
>     8962  0.0  0.0   3808    868 pts/1 R+   00:45 0:00 ps u
>
> Does this mean R is using 25% of my memory? But my RAM is 2 GB and the objects in R only occupy 40 MB according to gc().
>
> Did I interpret it wrong?
>
> Thanks a lot!
>
> > From: jim holtman [EMAIL PROTECTED]
> > To: zhihua li [EMAIL PROTECTED]
> > CC: r-help@stat.math.ethz.ch
> > Subject: Re: [R] memory error with 64-bit R in linux
> > Date: Wed, 18 Jul 2007 17:50:31 -0500
> >
> > [snip]

--
Jim Holtman
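The totals quoted in the reply can be read off the matrix that gc() returns: the second column holds the Mb currently in use and the last column the maximum Mb used so far, one row each for Ncells and Vcells. A minimal sketch of that bookkeeping (the variable names are introduced here; the exact number of columns may vary between R versions, hence the use of ncol()):

```r
g <- gc()                       # numeric matrix, rows "Ncells" and "Vcells"

used.mb <- sum(g[, 2])          # Mb in use right now (Ncells + Vcells)
max.mb  <- sum(g[, ncol(g)])    # maximum Mb used so far in this session

# The figure from the reply: 476.4 Mb (Ncells) + 119.6 Mb (Vcells)
reply.max.mb <- 476.4 + 119.6
```

Calling gc(reset = TRUE) resets the "max used" statistics, which helps when measuring a single step of a loop.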
Re: [R] poor rbind performance
Read the data into a list and then:

    do.call('rbind', myList)

at the end so you do it only once. You are having to reallocate memory on each iteration, so no wonder it is slow.

On 7/17/07, Aydemir, Zava (FID) [EMAIL PROTECTED] wrote:
> Hi
>
> I rbind data frames in a loop in a cumulative way and the performance deteriorates very quickly.
>
> My code looks like this:
>
>     for( k in 1:N) {
>       filename <- paste("/tmp/myData_",as.character(k),".txt",sep="")
>       myDataTmp <- read.table(filename,header=TRUE,sep=",")
>       if( k == 1) {
>         myData <- myDataTmp
>       } else {
>         myData <- rbind(myData,myDataTmp)
>       }
>     }
>
> Some more details:
> - the size of the stored text files is about 100,000 rows and 50 columns each
> - for k=1: rbind takes 0.0004 seconds
> - for k=2: rbind takes 13 seconds
> - for k=3: rbind takes 30 seconds
> - for k=4: rbind takes 36 seconds
> etc.
>
> Any suggestions to improve speed?
>
> Thanks
>
> Zava

--
Jim Holtman
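Spelled out, the list-then-rbind pattern might look like the sketch below. The three small files are generated in a temporary directory purely so the example is self-contained; the real files under /tmp are assumed to have identical columns.

```r
N <- 3
tmp.dir <- tempdir()
files <- file.path(tmp.dir, paste("myData_", 1:N, ".txt", sep = ""))

# Create small stand-in files (hypothetical contents)
for (k in 1:N) {
  write.table(data.frame(id = k, value = 1:4), file = files[k],
              sep = ",", row.names = FALSE)
}

# Read every file into one list element ...
myList <- lapply(files, read.table, header = TRUE, sep = ",")

# ... and bind all pieces together in a single call
myData <- do.call("rbind", myList)
```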
Re: [R] problem with length()
POSIXlt is a list structure of 9 elements (see ?POSIXlt). You can see that in the data below:

    > x <- as.POSIXlt(c('2007-01-01','2007-02-01','2007-03-31'))
    > length(x)
    [1] 9
    > unclass(x)
    $sec
    [1] 0 0 0
    $min
    [1] 0 0 0
    $hour
    [1] 0 0 0
    $mday
    [1]  1  1 31
    $mon
    [1] 0 1 2
    $year
    [1] 107 107 107
    $wday
    [1] 1 4 6
    $yday
    [1]  0 31 89
    $isdst
    [1] 0 0 0
    attr(,"tzone")
    [1] "GMT"
    > length(as.POSIXct(x))
    [1] 3

What you probably want to do is to use the POSIXct class.

On 7/16/07, Jacob Etches [EMAIL PROTECTED] wrote:
> In the following, can anyone tell me why length(eee) returns 9? I was expecting 15398, and when I try to add this vector to a data frame with that many rows, it fails, complaining that the vector is of length 9. In what I thought was an identical situation with a related dataset, the same code worked as expected.
>
>     > length(fff)
>     [1] 15398
>     > str(fff)
>      int [1:15398] 20010102 20010102 20010102 20010103 20010103 20010102 20010102 20010104 20010103 20010102 ...
>     > fff[1:12]
>      [1] 20010102 20010102 20010102 20010103 20010103 20010102 20010102 20010104 20010103 20010102 20010105 20010103
>     > eee <- as.POSIXlt(strptime(fff,"%Y%m%d"))
>     > length(eee)
>     [1] 9
>     > eee[1:12]
>      [1] "2001-01-02" "2001-01-02" "2001-01-02" "2001-01-03" "2001-01-03" "2001-01-02" "2001-01-02" "2001-01-04" "2001-01-03" "2001-01-02" "2001-01-05" "2001-01-03"
>     > str(eee)
>     'POSIXlt', format: chr [1:15398] "2001-01-02" "2001-01-02" "2001-01-02" "2001-01-03" "2001-01-03" "2001-01-02" "2001-01-02" "2001-01-04" "2001-01-03" ...
>
> Many thanks in advance,
>
> Jacob Etches
>
> Doctoral candidate, Epidemiology Program
> Department of Public Health Sciences, University of Toronto Faculty of Medicine
>
> Research Associate
> Institute for Work Health
> 800-481 University Avenue, Toronto, Ontario, Canada M5G 2E9

--
Jim Holtman
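A sketch of the POSIXct route for integer dates like those in the question. The three values and the data frame df are stand-ins, and tz = "GMT" is an assumption made so the result does not depend on the local time zone. Note that what length() reports for a POSIXlt object depends on the R version (the output above is from the version used in this thread), which is one more reason to store POSIXct in data frames.

```r
fff <- c(20010102L, 20010103L, 20010105L)        # stand-in for the 15398 values

# Parse, then convert to POSIXct: one number per date-time
eee <- as.POSIXct(strptime(as.character(fff), "%Y%m%d", tz = "GMT"))

df <- data.frame(id = seq_along(fff))
df$date <- eee                                   # lengths match, so this works

# The list representation, for comparison: one component per field
n.fields <- length(unclass(as.POSIXlt(eee)))
```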
Re: [R] Algorythmic Question on Array Filtration
Re: [R] filling a list faster
It all depends on what you want to do. In your example, it is faster to first fill in a matrix and then convert the matrix to a list. The problem with filling in the list is that you are dynamically allocating space on each iteration, which is probably taking at least an order of magnitude more time than the calculations you are doing. So I just translated your problem into two steps, and it takes about 2 seconds on my system.

    > # fill in a matrix
    > l <- matrix(ncol=3, nrow=10^5)
    > system.time(for(i in (1:10^5)) l[i,] <- c(i,i+1,i))
       user  system elapsed
       1.06    0.00    1.10
    > # convert to a list
    > system.time(l.list <- lapply(1:10^5, function(i) l[i,]))
       user  system elapsed
       0.45    0.00    0.46
    > l.list[1:10]
    [[1]]
    [1] 1 2 1
    [[2]]
    [1] 2 3 2
    [[3]]
    [1] 3 4 3
    [[4]]
    [1] 4 5 4
    [[5]]
    [1] 5 6 5

On 7/13/07, Balazs Torma [EMAIL PROTECTED] wrote:
> hello,
>
> first I create a list:
>
>     l <- list(1-c(1,2,3))
>
> then I run the following loop; it takes over a minute(!) to complete on a very fast machine:
>
>     for(i in (1:10^5)) l[[length(l)+1]] <- c(i,i+1,i)
>
> How can I fill a list faster? (This is just a demo test; the elements of the list are calculated iteratively in an algorithm.)
>
> Are there any packages and documents on how to use more advanced and fast data structures like linked lists, hash tables or trees, for example?
>
> Thank you,
> Balazs Torma

--
Jim Holtman
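When the elements really have to be computed one after another, the list itself can be preallocated so that no growing happens inside the loop. A minimal sketch of that alternative (not the matrix approach shown in the reply), with a smaller n to keep it quick:

```r
n <- 10^4

# Allocate all n slots once, up front
l <- vector("list", n)

# Fill by index instead of appending at length(l) + 1
for (i in seq_len(n)) {
  l[[i]] <- c(i, i + 1, i)
}
```

If the final length is not known in advance, allocating a generous upper bound and trimming with l[seq_len(used)] afterwards keeps the same benefit.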
Re: [R] filling a list faster
Actually, if you are really interested in the list, then just do the lapply and compute your data; it seems to be even faster than the matrix:

    > system.time(l.1 <- lapply(1:10^5, function(i) c(i, i+1, i)))
       user  system elapsed
       0.50    0.00    0.61
    > l.1[1:4]
    [[1]]
    [1] 1 2 1
    [[2]]
    [1] 2 3 2
    [[3]]
    [1] 3 4 3
    [[4]]
    [1] 4 5 4

On 7/13/07, Philippe Grosjean [EMAIL PROTECTED] wrote:
> If all the data coming from your iterations are numeric (as in your toy example), why not use a matrix with one row per iteration? Also, do preallocate the matrix and do not add row or column names before the end of the calculation. Something like:
>
>     > m <- matrix(rep(NA, 3*10^5), ncol = 3)
>     > system.time(for(i in (1:10^5)) m[i, ] <- c(i,i+1,i))
>        user  system elapsed
>       1.362   0.033   1.424
>
> That is, about 1.5 sec on my Intel Core 2 Duo 2.33 GHz MacBook Pro, compared to:
>
>     > l <- list(1-c(1,2,3))
>     > system.time(for(i in (1:10^5)) l[[length(l)+1]] <- c(i,i+1,i))
>        user  system elapsed
>     191.629  49.110 248.454
>
> ... more than 4 minutes for your code.
>
> By the way, what is your very fast machine, that is actually four times faster than mine (gr!)?
>
> Best,
>
> Philippe Grosjean
>
> Prof. Philippe Grosjean
> Numerical Ecology of Aquatic Systems
> Mons-Hainaut University, Belgium
>
> Balazs Torma wrote:
> > hello,
> >
> > first I create a list:
> >
> >     l <- list(1-c(1,2,3))
> >
> > then I run the following loop; it takes over a minute(!) to complete on a very fast machine:
> >
> >     for(i in (1:10^5)) l[[length(l)+1]] <- c(i,i+1,i)
> >
> > How can I fill a list faster? (This is just a demo test; the elements of the list are calculated iteratively in an algorithm.)
> >
> > Are there any packages and documents on how to use more advanced and fast data structures like linked lists, hash tables or trees, for example?
> >
> > Thank you,
> > Balazs Torma

--
Jim Holtman
Re: [R] Compute rank within factor groups
Is this what you are looking for:

    > x
            report score
    9         ADEA  0.96
    8         ADEA  0.90
    11 Asylum_FED9  0.86
    3         ADEA  0.75
    14 Asylum_FED9  0.60
    5         ADEA  0.56
    13 Asylum_FED9  0.51
    16 Asylum_FED9  0.51
    2         ADEA  0.42
    7         ADEA  0.31
    17 Asylum_FED9  0.27
    1         ADEA  0.17
    4         ADEA  0.17
    6         ADEA  0.12
    10        ADEA  0.11
    12 Asylum_FED9  0.10
    15 Asylum_FED9  0.09
    18 Asylum_FED9  0.07
    > x$rank <- ave(x$score, x$report, FUN=rank)
    > x
            report score rank
    9         ADEA  0.96 10.0
    8         ADEA  0.90  9.0
    11 Asylum_FED9  0.86  8.0
    3         ADEA  0.75  8.0
    14 Asylum_FED9  0.60  7.0
    5         ADEA  0.56  7.0
    13 Asylum_FED9  0.51  5.5
    16 Asylum_FED9  0.51  5.5
    2         ADEA  0.42  6.0
    7         ADEA  0.31  5.0
    17 Asylum_FED9  0.27  4.0
    1         ADEA  0.17  3.5
    4         ADEA  0.17  3.5
    6         ADEA  0.12  2.0
    10        ADEA  0.11  1.0
    12 Asylum_FED9  0.10  3.0
    15 Asylum_FED9  0.09  2.0
    18 Asylum_FED9  0.07  1.0

On 7/12/07, Ken Williams [EMAIL PROTECTED] wrote:
> Hi,
>
> I have a data.frame which is ordered by score, and has a factor column:
>
>     Browse[1]> wc[c("report","score")]
>             report score
>     9         ADEA  0.96
>     8         ADEA  0.90
>     11 Asylum_FED9  0.86
>     3         ADEA  0.75
>     14 Asylum_FED9  0.60
>     5         ADEA  0.56
>     13 Asylum_FED9  0.51
>     16 Asylum_FED9  0.51
>     2         ADEA  0.42
>     7         ADEA  0.31
>     17 Asylum_FED9  0.27
>     1         ADEA  0.17
>     4         ADEA  0.17
>     6         ADEA  0.12
>     10        ADEA  0.11
>     12 Asylum_FED9  0.10
>     15 Asylum_FED9  0.09
>     18 Asylum_FED9  0.07
>     Browse[1]>
>
> I need to add a column indicating rank within each factor group, which I currently accomplish like so:
>
>     wc$rank <- 0
>     for(report in as.character(unique(wc$report))) {
>       wc[wc$report==report,]$rank <- 1:sum(wc$report==report)
>     }
>
> I have to wonder whether there's a better way, something that gets rid of the for() loop using tapply() or by() or similar. But I haven't come up with anything.
>
> I've tried these:
>
>     by(wc, wc$report, FUN=function(pr){pr$rank <- 1:nrow(pr)})
>
>     by(wc, wc$report, FUN=function(pr){wc[wc$report %in% pr$report,]$rank <- 1:nrow(pr)})
>
> But in both cases the effect of the assignment is lost; there's no $rank column generated for wc.
>
> Any suggestions?
>
> -Ken

--
Jim Holtman
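The loop in the question numbers the rows 1, 2, 3, ... from the highest score down, whereas rank() counts from the lowest score up and averages ties. A sketch that reproduces the loop's numbering with ave(), on a shortened made-up copy of the data; ranking the negated score and using ties.method = "first" are choices made here, not part of the reply.

```r
wc <- data.frame(
  report = c("ADEA", "ADEA", "Asylum_FED9", "ADEA", "Asylum_FED9"),
  score  = c(0.96, 0.90, 0.86, 0.75, 0.60)
)

# Rank the negated score within each report: highest score gets 1,
# and tied scores keep their order of appearance
wc$rank <- ave(-wc$score, wc$report,
               FUN = function(s) rank(s, ties.method = "first"))
```

Because ave() returns a vector aligned with the original rows, this works whether or not the data frame is already sorted by score.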
Re: [R] (no subject)
Is this what you want to do:

    > auto.length <- c(12,15,6)
    > for(i in 1:3) {
    +     nam <- paste("auto.data", i, sep=".")
    +     assign(nam, as.data.frame(matrix(1:auto.length[i], ncol=3)))
    + }
    > auto.data.1
      V1 V2 V3
    1  1  5  9
    2  2  6 10
    3  3  7 11
    4  4  8 12
    > auto.data.2
      V1 V2 V3
    1  1  6 11
    2  2  7 12
    3  3  8 13
    4  4  9 14
    5  5 10 15
    > # output the data
    > for(i in 1:3){
    +     cat(x <- paste('auto.data.', i, sep=''), '\n')
    +     print(get(x))
    + }
    auto.data.1
      V1 V2 V3
    1  1  5  9
    2  2  6 10
    3  3  7 11
    4  4  8 12
    auto.data.2
      V1 V2 V3
    1  1  6 11
    2  2  7 12
    3  3  8 13
    4  4  9 14
    5  5 10 15
    auto.data.3
      V1 V2 V3
    1  1  3  5
    2  2  4  6

On 7/12/07, Drescher, Michael (MNR) [EMAIL PROTECTED] wrote:
> Hi All,
>
> I want to automatically generate a number of data frames, each with an automatically generated name and an automatically generated number of rows. The number of rows has been calculated before and is different for all data frames (e.g. c(4,5,2)). The number of columns is known a priori and is the same for all data frames (e.g. c(3,3,3)). The resulting data frames could look something like this:
>
>     > auto.data.1
>       X1 X2 X3
>     1  0  0  0
>     2  0  0  0
>     3  0  0  0
>     4  0  0  0
>
>     > auto.data.2
>       X1 X2 X3
>     1  0  0  0
>     2  0  0  0
>     3  0  0  0
>     4  0  0  0
>     5  0  0  0
>
>     > auto.data.3
>       X1 X2 X3
>     1  0  0  0
>     2  0  0  0
>
> Later, I want to fill the elements of the data frames with values read from somewhere else, automatically looping through the previously generated data frames.
>
> I know that I can automatically generate variables with the right number of elements with something like this:
>
>     > auto.length <- c(12,15,6)
>     > for(i in 1:3) {
>     +     nam <- paste("auto.data",i, sep=".")
>     +     assign(nam, 1:auto.length[i])
>     + }
>     > auto.data.1
>      [1]  1  2  3  4  5  6  7  8  9 10 11 12
>     > auto.data.2
>      [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
>     > auto.data.3
>     [1] 1 2 3 4 5 6
>
> But how do I turn these variables into data frames or give them any dimensions? Commands such as 'as.matrix', 'data.frame', or 'dim' do not seem to work. I also do not seem to be able to access the variables with something like "auto.data.i", since:
>
>     > auto.data.i
>     Error: object "auto.data.i" not found
>
> Thus, how would I be able to automatically write to the elements of the data frames later in a loop such as ...
>
>     > for(i in 1:3) {
>     +   for(j in 1:nrow(auto.data.i)) {   ### this obviously does not work: 'Error in nrow(auto.data.i) : object "auto.data.i" not found'
>     +     for(k in 1:ncol(auto.data.i)) {
>     +       auto.data.i[j,k] <- 'some value'
>     + }}}
>
> Thanks a bunch for all your help.
>
> Best, Michael
>
> Michael Drescher
> Ontario Forest Research Institute
> Ontario Ministry of Natural Resources
> 1235 Queen St East
> Sault Ste Marie, ON, P6A 2E3

--
Jim Holtman
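An alternative to assign() and get(), sketched here as a suggestion and not as part of the reply: keep the data frames in one named list, so that the loop index addresses them directly. The fill value i * 100 + j * 10 + k is a placeholder for the "values read from somewhere else".

```r
n.rows <- c(4, 5, 2)    # rows per data frame, as in the question
n.cols <- 3

# One list element per data frame, all initialised to 0
auto.data <- lapply(n.rows, function(n) as.data.frame(matrix(0, nrow = n, ncol = n.cols)))
names(auto.data) <- paste("auto.data", seq_along(n.rows), sep = ".")

# Loop over frames, rows and columns, writing one element at a time
for (i in seq_along(auto.data)) {
  for (j in seq_len(nrow(auto.data[[i]]))) {
    for (k in seq_len(ncol(auto.data[[i]]))) {
      auto.data[[i]][j, k] <- i * 100 + j * 10 + k   # placeholder value
    }
  }
}
```

A single frame is then available as auto.data[["auto.data.2"]] or auto.data[[2]].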
Re: [R] is.null doesn't work
'v' appears to be a list:

    > v=c(`-`,`+`,1,`^`,`^`,NA,NA,"X",9,"X",2)
    > i2=16
    > v[i2]
    [[1]]
    NULL
    > str(v)
    List of 11
     $ :function (e1, e2)
     $ :function (e1, e2)
     $ : num 1
     $ :function (e1, e2)
     $ :function (e1, e2)
     $ : logi NA
     $ : logi NA
     $ : chr "X"
     $ : num 9
     $ : chr "X"
     $ : num 2

because you used backquotes (`) on the '-'; notice the difference:

    > str(c(`-`,1))
    List of 2
     $ :function (e1, e2)
     $ : num 1
    > str(c('-',1))
     chr [1:2] "-" "1"

On 7/12/07, Atte Tenkanen [EMAIL PROTECTED] wrote:
> Hi,
>
> What's wrong here?:
>
>     > v=c(`-`,`+`,1,`^`,`^`,NA,NA,"X",9,"X",2)
>     > i2=16
>     > v[i2]
>     [[1]]
>     NULL
>     > is.null(v[i2])
>     [1] FALSE
>
> Is it a bug or have I misunderstood something?
>
> Atte Tenkanen
> University of Turku, Finland

--
Jim Holtman
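To connect this back to is.null(): single-bracket indexing of a list always returns a list, so v[i2] is a one-element list that happens to contain NULL, and a list is never NULL itself. A short sketch with a shortened v; checking the index against length(v) is one way to detect a missing position.

```r
v <- c(`-`, `+`, 1, "X")    # functions mixed with atomic values give a list
i2 <- 16

sub <- v[i2]                # list of length 1 whose only element is NULL

wrapper.is.null <- is.null(sub)        # FALSE: the wrapper list exists
content.is.null <- is.null(sub[[1]])   # TRUE: its content is NULL
out.of.range <- i2 > length(v)         # TRUE: position 16 does not exist
```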
Re: [R] exces return by mktcap decile for each year
to do this.

dat <- read.table("test.data", header=TRUE)
if ("new.data" %in% ls()) { rm(new.data) }
yrs <- as.character(unique(dat$yr))
for (y in yrs) {
  bool <- as.character(dat$yr) == y
  tmp.dat <- dat[bool, ]
  breaks <- quantile(tmp.dat$mc, probs=seq(0, 1, 0.1), na.rm=TRUE)
  # lower the first break so the minimum falls inside the first (a,b] interval
  breaks[1] <- breaks[1] * .9
  cuts <- cut(tmp.dat$mc, breaks)
  means.by.dec <- by(tmp.dat$ret, cuts, mean)
  for (i in seq(1, dim(tmp.dat)[1])) {
    tmp.dat[i, "dec.mean"] <- means.by.dec[cuts[i]]
  }
  if (!"new.data" %in% ls()) {
    new.data <- tmp.dat
  } else {
    new.data <- rbind(new.data, tmp.dat)
  }
}

Here is some test input data, in the file test.data:

       mc         yr    ret
32902.233 01/01/1995  0.426
15793.691 01/01/1995  0.024
 2375.868 01/01/1995  0.660
54586.558 01/01/1996  0.497
10674.900 01/01/1996  0.405
  859.656 01/01/1996 -0.033
  770.963 01/01/1995 -1.248
  423.480 01/01/1995  0.654
 2135.504 01/01/1995  0.394
  696.599 01/01/1995 -0.482
 5115.476 01/01/1995  0.352
  821.347 01/01/1995  0.869
43329.695 01/01/1995  0.495
 7975.151 01/01/1995  0.112
  396.450 01/01/1995  0.956
  843.870 01/01/1995  0.172
 2727.037 01/01/1995 -0.358
  114.584 01/01/1995 -1.015
 1347.327 01/01/1995 -0.083
 4592.049 01/01/1995 -0.251
  674.305 01/01/1995 -0.327
39424.887 01/01/1996  0.198
 4447.383 01/01/1996 -0.045
 1608.540 01/01/1996 -0.109
  217.151 01/01/1996  0.539
 1813.320 01/01/1996  0.754
  145.170 01/01/1996  0.249
 3176.298 01/01/1996 -0.202
14379.686 01/01/1996  0.013
 3009.059 01/01/1996 -0.328
 1781.406 01/01/1996 -0.158
 2576.215 01/01/1996  0.514
 1236.317 01/01/1996  0.346
 3003.735 01/01/1996  0.151
 1544.003 01/01/1996  0.482
 7588.657 01/01/1996  0.306
 1516.625 01/01/1996  0.183
 1596.098 01/01/1996  0.674
 2792.192 01/01/1996  0.528
 1276.702 01/01/1996  0.010
  875.716 01/01/1996  0.189
 4858.450 01/01/1995  0.250
 2033.623 01/01/1995 -0.582
 2164.125 01/01/1995  0.631

Here is the output, which looks OK:

new.data
          mc         yr    ret   dec.mean
1  32902.233 01/01/1995  0.426  0.4605000
2   4858.450 01/01/1995  0.250  0.3010000
3   2033.623 01/01/1995 -0.582 -0.0940000
4   2164.125 01/01/1995  0.631  0.6455000
5  15793.691 01/01/1995  0.024  0.0680000
6   2375.868 01/01/1995  0.660  0.6455000
7    770.963 01/01/1995 -1.248 -0.1895000
8    423.480 01/01/1995  0.654  0.1983333
9   2135.504 01/01/1995  0.394 -0.0940000
10   696.599 01/01/1995 -0.482 -0.4045000
11  5115.476 01/01/1995  0.352  0.3010000
12   821.347 01/01/1995  0.869 -0.1895000
13 43329.695 01/01/1995  0.495  0.4605000
14  7975.151 01/01/1995  0.112  0.0680000
15   396.450 01/01/1995  0.956  0.1983333
16   843.870 01/01/1995  0.172  0.0445000
17  2727.037 01/01/1995 -0.358 -0.3045000
18   114.584 01/01/1995 -1.015  0.1983333
19  1347.327 01/01/1995 -0.083  0.0445000
20  4592.049 01/01/1995 -0.251 -0.3045000
21   674.305 01/01/1995 -0.327 -0.4045000
22 39424.887 01/01/1996  0.198  0.2360000
23  4447.383 01/01/1996 -0.045 -0.1235000
24  1608.540 01/01/1996 -0.109  0.1623333
25   217.151 01/01/1996  0.539  0.2516667
26  1813.320 01/01/1996  0.754  0.1623333
27   145.170 01/01/1996  0.249  0.2516667
28  3176.298 01/01/1996 -0.202 -0.1235000
29 14379.686 01/01/1996  0.013  0.2360000
30  3009.059 01/01/1996 -0.328 -0.0885000
31  1781.406 01/01/1996 -0.158  0.1623333
32  2576.215 01/01/1996  0.514  0.5210000
33  1236.317 01/01/1996  0.346  0.2675000
34  3003.735 01/01/1996  0.151 -0.0885000
35  1544.003 01/01/1996  0.482  0.5780000
36  7588.657 01/01/1996  0.306  0.3555000
37  1516.625 01/01/1996  0.183  0.0965000
38 54586.558 01/01/1996  0.497  0.2360000
39 10674.900 01/01/1996  0.405  0.3555000
40   859.656 01/01/1996 -0.033  0.2516667
41  1596.098 01/01/1996  0.674  0.5780000
42  2792.192 01/01/1996  0.528  0.5210000
43  1276.702 01/01/1996  0.010  0.0965000
44   875.716 01/01/1996  0.189  0.2675000

Notice that records 1 and 13 fall into the same mc decile for the year 1995, and their ret mean is .4605, and so forth for the other mc deciles in both years.

I'd be interested to know if there is a cleaner way to do this. Thanks.

Frank

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?
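On the closing question of a cleaner way: one possible sketch (not taken from the thread) replaces the explicit loops with ave(), which applies a function within groups and returns a vector aligned with the original rows. The data below is made up, and the names decile and dec are hypothetical. It also uses include.lowest = TRUE in cut() rather than shrinking the first break, which is a swapped-in choice; it assumes the mc values within a year are distinct enough that the quantile breaks are unique.

```r
# Hypothetical helper: decile number (1..10) of each value within one group.
decile <- function(x) {
  breaks <- quantile(x, probs = seq(0, 1, 0.1), na.rm = TRUE)
  # include.lowest = TRUE keeps the minimum inside the first interval
  cut(x, breaks, include.lowest = TRUE, labels = FALSE)
}

# Made-up data: 20 firms per year, two years.
dat <- data.frame(
  mc  = c(1:20, 1:20 * 10),
  yr  = rep(c(1995, 1996), each = 20),
  ret = (1:40) / 100
)

# Decile of mc within each year, then mean ret within each year/decile cell.
dat$dec      <- ave(dat$mc, dat$yr, FUN = decile)
dat$dec.mean <- ave(dat$ret, dat$yr, dat$dec)   # default FUN is mean

head(dat)
```

Because ave() keeps the original row order, no rbind() step is needed to reassemble the years.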
Re: [R] Split graphs
How many columns do you have? Is it 2 or 1000? I cannot tell from your email. A histogram of 2 values does not seem meaningful. Do you want 1000 separate histograms, one per page, or multiple per page? Yes, you can do it; the question is what you want to do and how.

On 7/9/07, tian shen wrote:

Hello All,

I have a question which I think is easy, but I just couldn't get it. I want to draw a histogram of each row of a 1000*2 matrix (that is, it has 1000 rows), and I want to see those 1000 pictures together. How can I do this? Am I able to split a graph into 1000 parts, with each part containing a histogram for one row?

Thank you very much
Jessie

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?
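As an illustration of the "multiple per page" option mentioned in the reply, here is a minimal sketch using par(mfrow = ...) on a PDF device. The matrix m, its size (12 rows of 50 values), the 3 x 4 grid, and the output file name are all assumptions made for the example; they are not from the thread. With more rows than grid cells, the pdf device simply starts a new page each time the grid fills up.

```r
set.seed(1)
# Made-up data: 12 rows, 50 values per row (enough for a meaningful histogram).
m <- matrix(rnorm(12 * 50), nrow = 12)

out <- file.path(tempdir(), "row-hists.pdf")   # hypothetical output file
pdf(out)
op <- par(mfrow = c(3, 4), mar = c(2, 2, 2, 1))  # 3 x 4 panels per page
for (i in seq_len(nrow(m))) {
  hist(m[i, ], main = paste("row", i), xlab = "")
}
par(op)
invisible(dev.off())
```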
Re: [R] parsing strings
Is this what you want:

x <- "A10B10A10 B5AB 10 CD 12A10CD2EF3"
x <- gsub(" ", "", x)  # remove blanks
y <- gregexpr("[A-Z]+\\s*[0-9]+", x)[[1]]
substring(x, y, y + attr(y, 'match.length') - 1)
[1] "A10"  "B10"  "A10"  "B5"   "AB10" "CD12" "A10"  "CD2"  "EF3"

On 7/9/07, Drescher, Michael (MNR) wrote:

Hi All,

I have strings made up of an unknown number of letters, digits, and spaces. Strings always start with one or two letters, and always end with one or two digits. A set of letters (one or two letters) is always followed by a set of digits (one or two digits), possibly with one or more spaces between the sets of letters and digits. A set of letters always belongs to the following set of digits, and I want to parse the strings into these groups.

As an example, the strings and the desired parsing results could look like this:

A10B10, desired parsing result: A10 and B10
A10 B5, desired parsing result: A10 and B5
AB 10 CD 12, desired parsing result: AB10 and CD12
A10CD2EF3, desired parsing result: A10, CD2, and EF3

I assume that it is possible to search a string for letters and digits and then break the string where letters are followed by digits; however, I am a bit clueless about how I could use, e.g., the 'charmatch' or 'parse' commands to achieve this.

Thanks a lot in advance for your help.

Best, Michael

Michael Drescher
Ontario Forest Research Institute
Ontario Ministry of Natural Resources
1235 Queen St East
Sault Ste Marie, ON, P6A 2E3
Tel: (705) 946-7406
Fax: (705) 946-2030

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?
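The same idea can be applied to several strings at once. The sketch below is a variation on the reply, not the code from the thread: it wraps the blank removal and the gregexpr() search in a hypothetical helper, parse_groups, and uses regmatches() (available in base R from version 2.14.0) instead of substring() to pull out the matches.

```r
# Hypothetical helper: split each string into letter+digit groups.
parse_groups <- function(s) {
  s <- gsub(" ", "", s, fixed = TRUE)          # remove blanks
  regmatches(s, gregexpr("[A-Z]+[0-9]+", s))   # one character vector per string
}

res <- parse_groups(c("A10B10", "A10 B5", "AB 10 CD 12", "A10CD2EF3"))
res
```

The result is a list with one element per input string, so the groups stay attached to the string they came from.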
Re: [R] making groups
It would be nice if you could supply an example of what your input looks like and what you would like your output to look like. You would probably use 'tapply', but I would have to see what your data looks like.

On 7/9/07, Mag. Ferri Leberl wrote:

Dear everybody!

If I have an array of numbers, e.g. the points my students got at an examination, and a key to group the numbers, e.g. the key which interval corresponds to which mark (two arrays of the same length, or one 2 x (number of marks) array), how can I get the array of absolute frequencies of marks? I hope I have expressed my problem clearly.

Thank you in advance.
Mag. Ferri Leberl

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?
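Since no example data was posted, the following is only a guess at what the input might look like. It assumes the key is a vector of lower interval bounds (lower) with a matching vector of marks, and it uses cut() with table() rather than the 'tapply' mentioned in the reply; all values and names are made up.

```r
# Made-up exam points and a made-up grading key.
points <- c(12, 35, 47, 50, 28, 41, 19, 33)
lower  <- c(0, 20, 30, 40, 45)   # lower bound of each interval (assumed layout)
marks  <- c(5, 4, 3, 2, 1)       # mark awarded for each interval

# Assign each score to its interval [lower, next lower), labelled by the mark.
grp  <- cut(points, breaks = c(lower, Inf), labels = marks, right = FALSE)

# Absolute frequency of each mark.
freq <- table(grp)
freq
```

With right = FALSE a score exactly on a bound counts toward the higher interval; drop that argument if the key is defined the other way round.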
Re: [R] one question about the loop
It is part of the standard 'utils' package that comes with R:

?combn
help.search('combination')

On 7/8/07, [EMAIL PROTECTED] wrote:

jim holtman wrote:

Is this what you want?

t(combn(5,2))

Well, it seems nice, but which library does it come from? I tried help.search("combn"), but that did not give me any valuable information...

Christophe

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?
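For readers who have not seen the function, a short sketch of what the suggested call produces: combn(5, 2) returns every pair drawn from 1..5 as the columns of a matrix, and t() turns those into rows. The utils:: prefix is added here only to make the package explicit, and the variable name pairs is arbitrary.

```r
# All 2-element combinations of 1..5, one pair per row.
pairs <- t(utils::combn(5, 2))
pairs

# Each row can then stand in for the (i, j) indices of a nested loop.
for (k in seq_len(nrow(pairs))) {
  i <- pairs[k, 1]
  j <- pairs[k, 2]
  # ... work with the pair (i, j) here ...
}
```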