Re: [R] Splitting a huge vector
Dear R users,

Sorry, I made a mistake in my specification of the problem. It should be:

    # my vector is this
    a.vector <- seq(2, by=5, length=1000)

    # so my cut values are
    cut.values <- c(30, 50, 100, 109, 300, 601, 803, 1000)

    # and x should have an extra ) at the end
    x <- rep(1:length(cut.values), times=diff(c(0, cut.values)))

Thanks for all the responses so far. Any additional responses are welcome.

Many thanks,
Dave

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
[R] Splitting a huge vector
Dear R users,

I have a huge vector that I would like to split into unequal slices. However, the only way I can do this is to create another huge vector to define the groups that are used to split the original vector, e.g.

    # my vector is this
    a.vector <- seq(2, by=5, length=100)

    # indices where I would like to slice my vector
    cut.values <- c(30, 50, 100, 109, 300, 601, 803)

    # so I have to create another vector of similar length
    # to use the split() command, i.e.
    x <- rep(1:length(cut.values), times=diff(c(0, cut.values))

    # this means I can use split()
    split(a.vector, x)

This seems to be a waste in terms of memory usage, as I'm creating another vector (here x) just to split the original vector. Is there a better way to split a huge vector than this?

Any help is much appreciated.

Best,
Dave.
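One possible alternative, sketched here rather than offered as a definitive answer, is to slice the vector by index ranges so that no grouping vector of the same length is needed. The sketch assumes the last cut value equals the length of the vector (as in the corrected specification with length 1000); the names `starts` and `slices` are introduced only for illustration. Whether this actually saves memory on a very large vector would need to be measured.

```r
# Example data: a vector of length 1000 and cut points whose last value
# equals the vector length (an assumption of this sketch).
a.vector <- seq(2, by = 5, length.out = 1000)
cut.values <- c(30, 50, 100, 109, 300, 601, 803, 1000)

# Start index of each slice: one past the previous cut point.
starts <- c(1, head(cut.values, -1) + 1)

# Extract each slice directly by its index range; only two short
# index vectors are created instead of a full-length grouping vector.
slices <- Map(function(from, to) a.vector[from:to], starts, cut.values)
names(slices) <- seq_along(slices)

# Slice lengths, one per cut point.
lengths(slices)
```

The result is a list with one element per cut point, holding the same values that split() returns when given the grouping vector x.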
[R] lapply question
Dear members,

I have numerous arrays that are organised in a list. For example, suppose I have 2 arrays in a list called alist

    alist <- list(array(rpois(12,5), 6:8), array(rpois(15,5), 10:12))

with array dimnames

    dimnames(alist[[1]]) <- list(LETTERS[1:6], paste("namesd", 1:7, sep=""), paste("namese", 1:8, sep=""))
    dimnames(alist[[2]]) <- list(LETTERS[7:16], paste("namesf", 1:11, sep=""), paste("namesg", 1:12, sep=""))

I would like to use the lapply function to produce a report with:

    Array 1
    Dimension name: namese1
    Row  Value              Value-Average(excluding Value)
    A    alist[[1]][1,1,1]  alist[[1]][1,1,1]-mean(alist[[1]][1,-1,1])
         ...etc for all elements in the first row of the array
    B    alist[[1]][2,1,1]  alist[[1]][2,1,1]-mean(alist[[1]][2,-1,1])
         ...etc
    Dimension name: namese2
    Dimension name: namese8
    ...
    Array 2
    Dimension name: namesg1
    Dimension name: namesg12

Can I use apply to do this, something like

    lapply(alist, function(k) apply(k, c(1,3), ...

but how do I lay out the report using the array names, dimension names etc., with each observation on a separate line? Is it possible to give apply an array and get a list as output?

Thanks in advance for any help.

Dave
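A minimal sketch of one way such a report could be assembled is given below. It builds a data frame per slice of the third dimension with one observation per line, and is only an illustration under stated assumptions: the helper `report_slice`, the object `report`, and the column names `Row`, `Column`, `Value` and `Diff` are hypothetical names, and set.seed() is added solely to make the example data reproducible.

```r
set.seed(1)  # reproducible example data (not part of the original post)
alist <- list(array(rpois(12, 5), 6:8), array(rpois(15, 5), 10:12))
dimnames(alist[[1]]) <- list(LETTERS[1:6],
                             paste("namesd", 1:7, sep = ""),
                             paste("namese", 1:8, sep = ""))
dimnames(alist[[2]]) <- list(LETTERS[7:16],
                             paste("namesf", 1:11, sep = ""),
                             paste("namesg", 1:12, sep = ""))

# Hypothetical helper: one data frame for slice k of the third dimension,
# with one line per observation, ordered row by row.
report_slice <- function(arr, k) {
  m <- arr[, , k]  # rows x columns matrix for slice k
  # Mean of the other values in the same row (leave-one-out mean).
  loo_mean <- (rowSums(m) - m) / (ncol(m) - 1)
  data.frame(Row    = rep(rownames(m), each = ncol(m)),
             Column = rep(colnames(m), times = nrow(m)),
             Value  = as.vector(t(m)),
             Diff   = as.vector(t(m - loo_mean)))
}

# One list element per array, each holding one data frame per slice,
# named after the third dimension.
report <- lapply(alist, function(arr) {
  out <- lapply(seq_len(dim(arr)[3]), function(k) report_slice(arr, k))
  names(out) <- dimnames(arr)[[3]]
  out
})

# Print the start of the report for the first slice of the first array.
cat("Array 1\nDimension name: namese1\n")
print(head(report[[1]][["namese1"]]), row.names = FALSE)
```

Looping over `seq_along(report)` and `names(report[[i]])` with cat() and print() in the same way would write out the full report, with the array number and dimension name as headings.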
[R] reading non-existing files
Dear all,

I'm trying to read a collection of files in a loop using odbcConnectExcel, but not all of the files exist. This is the code I have:

    for(i in 1:no.of.subs){
      channel <- odbcConnectExcel(paste(working.dir, subs[i], ".xls", sep=""))
      datafiles[[i]] <- as.matrix(sqlFetch(channel, "Data"))
      close(channel)
    }

I'm not sure how to alter the code to allow for the fact that some files may not exist. These files should be ignored. Currently, I get the following error:

    Error in odbcTableExists(channel, sqtable) :
      'Data': table not found on channel

However, it creates an empty file for the first occurrence of a non-existing file and then stops.

I would very much appreciate any help. Thanks in advance.

Dave
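One common way to handle this, sketched here under stated assumptions, is to test each path with file.exists() before opening it and skip to the next iteration when the file is missing. Because odbcConnectExcel needs Windows and an Excel ODBC driver, the sketch below uses small CSV files and read.csv() as a stand-in for the RODBC calls; the directory, file names and data are made up for the example.

```r
# Stand-in setup: a fresh temporary directory with files for "sub1" and
# "sub3" only; "sub2" is deliberately never created.
working.dir <- tempfile("demo")
dir.create(working.dir)
subs <- c("sub1", "sub2", "sub3")
write.csv(data.frame(x = 1:3), file.path(working.dir, "sub1.csv"), row.names = FALSE)
write.csv(data.frame(x = 4:6), file.path(working.dir, "sub3.csv"), row.names = FALSE)

datafiles <- list()
for (i in seq_along(subs)) {
  path <- file.path(working.dir, paste(subs[i], ".csv", sep = ""))

  # Skip this iteration if the file is not there.
  if (!file.exists(path)) next

  # With RODBC, the odbcConnectExcel() / sqlFetch() / close() calls
  # would go here; read.csv() is only a stand-in for this sketch.
  datafiles[[subs[i]]] <- as.matrix(read.csv(path))
}

# Names of the files that were actually read.
names(datafiles)
```

Storing the results by name rather than by position avoids leaving empty slots in the list for the files that were skipped.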
[R] reading multiple files
Dear All,

How do I read in multiple data frames or matrices in a loop, e.g.

    for (i in 1:n) {
      channel <- odbcConnectExcel(filenames)
      file[i] <- as.data.frame(sqlFetch(channel, sheet))
    }

I would like file[i] to be the name of the data.frame (i.e. file[1], file[2], file[3], ...etc) rather than a vector.

Thanks in advance for any help.

Dave
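A usual way to keep several data frames under one name is a list indexed with double brackets. The sketch below is only an illustration: the made-up data frames stand in for the result of as.data.frame(sqlFetch(channel, sheet)), and the names `files`, `file1`, `file2`, ... are hypothetical.

```r
n <- 3

# Pre-allocate a list with one slot per data frame.
files <- vector("list", n)

for (i in seq_len(n)) {
  # Stand-in for as.data.frame(sqlFetch(channel, sheet)).
  files[[i]] <- data.frame(id = i, value = (1:4) * i)
}

# Optional: name the elements so they can also be accessed by name.
names(files) <- paste("file", seq_len(n), sep = "")

files[[2]]    # second data frame, by position
files$file3   # third data frame, by name
```

Single brackets, as in files[2], return a list of length one rather than the data frame itself, which is why the assignment inside the loop uses [[i]].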
[R] problem with RODBC
Dear all,

I'm reading data via an RODBC connection using odbcConnectExcel. I use sqlFetch(channel, sheetx) to identify the correct tab. It appears to read the data without any problems. However, when I extract a portion of the data, the row number specified is 1 less than in the actual Excel file, and it can't read any columns after the 94th column.

Can someone help me?

TIA
Dave
[R] grubbs.test
Dear All,

I have small samples of data (between 6 and 15) for numerous time series points. I am assuming the data for each time point is normally distributed. The problem is that the data arrives sporadically, and I would like to detect the number of outliers once I have six data points for any time period.

Essentially, once I have 6 data points I would like to test whether there are any outliers. If so, I remove the outliers, wait until I again have at least 6 data points or until the sample size increases, and test again whether there are any outliers. This process is repeated until there are no more data points to add to the sample.

Is it valid to use grubbs.test in this way? If not, are there any tests out there that might be appropriate for this situation? Rosner's test requires that I have at least 25 data points, which I don't have.

Thank you in advance for any help.

Dave
[R] Importing data into R
I have a highly formatted Excel file with multiple tabs. Is it currently possible to read this data into R without changing the format of the Excel file? Also, is it possible to write back to the same Excel file, or at least to create a new Excel file with the same formatting as before, containing the modified data that has been processed in R?

Thanks in advance for any help that you can provide.

Dave