[R] enumerating non-overlapping pairs of elements from a vector
Hi All,

I'm trying to come up with a clear and concise (and fast?) solution to the following problem. I would like to take a vector 'v' and enumerate all of the ways in which it can be broken into n sets of length 2 (plus, if the length of the vector is odd, one additional set of length 1). An element of 'v' can only appear in one set. Order within sets is not important.

Vector 'v' can be of lengths 2-12.
'n' is determined by length(v)%/%2.
If length(v)%%2 is non-zero, the additional set of length 1 is used.

For example, vector 'v':

v = (1,2,3,4)

The solution would be (rows are combinations of sets chosen, where each element only appears once):

1 2, 3 4
1 3, 2 4
1 4, 2 3

In the case where length(v) is odd:

v = (1,2,3,4,5)

1 2, 3 4, 5
1 3, 2 4, 5
1 4, 2 3, 5
5 2, 3 4, 1
5 3, 2 4, 1
5 4, 2 3, 1
5 1, 3 4, 2
5 3, 1 4, 2
5 4, 1 3, 2

and so on...

Certainly pulling all combinations of two or one elements is not a big deal; for example, combinations(5,2,c(1,2,3,4,5),repeats.allowed=T) from the 'gtools' package would do something like this.

I'm stuck on a clean solution for enumerating all the non-overlapping sets without some elaborate looping and checking scheme. No doubt this is a lapse in my understanding of combinatorics.

Any help would be greatly appreciated.

cheers,
a.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
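One way to avoid a looping-and-checking scheme for the problem above is recursion: for an even-length vector, pair the first element with each of the others in turn and recurse on what is left; for an odd-length vector, first pick the element that stands alone. The following is only a minimal sketch in base R, not a tuned implementation. The function name pairings() and the list-of-lists return format are made up for illustration.

```r
# Enumerate every way to split v into unordered, non-overlapping pairs
# (plus one singleton when length(v) is odd).
# Returns a list of partitions; each partition is a list of vectors
# of length 2 (and possibly one vector of length 1).
# pairings() is a hypothetical helper name, not from any package.
pairings <- function(v) {
  n <- length(v)
  if (n == 0) return(list(list()))   # one way to partition nothing
  if (n == 1) return(list(list(v)))  # a lone singleton
  out <- list()
  if (n %% 2 == 1) {
    # Odd length: choose the element that stands alone, pair the rest.
    for (i in seq_len(n)) {
      for (p in pairings(v[-i])) {
        out[[length(out) + 1]] <- c(p, list(v[i]))
      }
    }
    return(out)
  }
  # Even length: the first element must be paired with some other element.
  # Always anchoring on the first element avoids duplicate partitions.
  for (j in 2:n) {
    for (p in pairings(v[-c(1, j)])) {
      out[[length(out) + 1]] <- c(list(v[c(1, j)]), p)
    }
  }
  out
}

res4 <- pairings(1:4)
res5 <- pairings(1:5)
length(res4)  # 3 partitions, as in the even example
length(res5)  # 15 partitions for the odd example
```

For an even length n the number of partitions is (n-1)*(n-3)*...*1, so a vector of length 12 yields 10395 rows, which should still be manageable to hold in memory.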
[R] recategorizing a vector into discrete states
Hi All,

I'm trying to take a numerical vector and produce a new vector of the same length in which each element of the first is placed into a category given by a 'breaks-like' vector. The values in the result should equal the lower bounds of each category as defined in the breaks vector.

I suspect that a vectorized solution is pretty simple, but I can't seem to figure it out today. Here is an example of my problem: Vector 'a' is the original vector. Vector 'b' gives the lower bounds of the categories. Vector 'c' is the result I am seeking.

a <- c(0.9, 11, 1.2, 2.4, 4.0, 5.0, 7.3, 8.1, 3.3, 4.5)
b <- c(0, 2, 4, 6, 8)
c <- c(0, 8, 0, 2, 4, 4, 6, 8, 2, 4)

Any suggestions would be greatly appreciated.

cheers,
Allan Strand
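A possible vectorized approach to the question above uses base R's findInterval(), which returns, for each element of 'a', the index of the last break that is less than or equal to it. Indexing 'b' by that result gives the lower bounds. This sketch assumes 'b' is sorted in increasing order and that no element of 'a' is smaller than b[1] (such elements would get index 0 and be silently dropped). The name 'result' is used instead of 'c' only to avoid masking the c() function.

```r
a <- c(0.9, 11, 1.2, 2.4, 4.0, 5.0, 7.3, 8.1, 3.3, 4.5)
b <- c(0, 2, 4, 6, 8)  # lower bounds of the categories, sorted increasing

# findInterval(a, b) gives i such that b[i] <= a < b[i + 1];
# values at or above the last break fall in the last category.
idx <- findInterval(a, b)
result <- b[idx]
result
# 0 8 0 2 4 4 6 8 2 4
```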
[R] rmetasim: a population genetic simulation environment
Hi all,

My student, James Niehaus, and I have been working on an individual-based population genetic simulation package in R. Currently, the package is still rough, but useful enough that it may be of interest to population/ecological geneticists. The best description of the basic model (implemented in C++) can be found in:

Allan E. Strand. Metasim 1.0: an individual-based environment for simulating population genetics of complex population dynamics. Mol. Ecol. Notes, 2:373-376, 2002.

My idea was to produce an extremely flexible engine that could simulate genotypic data that result from almost any demographic scenario. These data can be used as null distributions to compare to observed datasets. Results of simulations can be exported to a variety of canned population genetic analysis programs, and rmetasim implements a few rudimentary analyses (e.g., Weir and Cockerham's theta, mismatch distributions, and assignment tests) as well. Several example sessions are included in pdf files found in the rmetasim/doc subdirectory.

rmetasim is mostly a wrapper for the C++ engine described in the paper cited above. Because it simulates individuals directly, rmetasim is not terribly fast, even though the majority of the processing occurs in compiled code. Nevertheless, I have found it useful, and running multiple simulations from the same starting conditions seems to work in a cluster environment using Rmpi.

We are actively working on this package, and would appreciate feedback so that we can improve its quality.

The source distribution can be found at:
http://linum.cofc.edu:/filedrop/rmetasim_0.0.3.tar.gz
It should compile on Linux boxes (that have R installed), though doing so takes a while.

Binary distribution for Mac OS X:
http://linum.cofc.edu:/filedrop/rmetasim_0.0.3_R_powerpc-apple-darwin6.8.tar.gz

Binary distribution for Windows:
http://linum.cofc.edu:/filedrop/rmetasim_0.0.3.zip

cheers,
a.

--
Allan Strand, Biology      http://linum.cofc.edu
College of Charleston      Ph. (843) 953-9189
Charleston, SC 29424       Fax (843) 953-9199
[R] vectorized leave one out analyses
Hi all,

I'm implementing a population genetic statistic that requires repeated re-estimation of population parameters after a single observation has been left out.

It seems to me that one could:
a) use looping in R,
b) use a vectorized approach in R,
c) loop in a dynamically loaded C function, or
d) use an existing jackknife routine.

An untested skeleton of the code for 'a' (assuming observations are rows of 'datfrm' and 'popstat' is the statistic of interest):

    foo <- function(datfrm) {
      retvec <- rep(0, nrow(datfrm))      # one estimate per left-out row
      selvec <- rep(TRUE, nrow(datfrm))   # TRUE = row is retained
      for (i in seq_len(nrow(datfrm))) {
        selvec[i] <- FALSE                # leave observation i out
        retvec[i] <- popstat(datfrm[selvec, , drop = FALSE])
        selvec[i] <- TRUE                 # put observation i back
      }
      retvec
    }

I suppose that 'd' is the easiest option if such a routine exists, but I have not come across one by means of an archive search. I'd like to avoid 'a' because of efficiency, and 'c' because of additional coding and linking steps. I like the idea of 'b' because it would be nifty and likely fast, though there may be memory issues.

I'm sure that this is a general problem that somebody has solved in an elegant fashion. I'm just looking for the solution.

--
Allan Strand, Biology      http://linum.cofc.edu
College of Charleston      Ph. (843) 953-9189
Charleston, SC 29424       Fax (843) 953-9199
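As a rough illustration of what option 'b' might look like, the sketch below replaces the explicit loop with vapply() and negative row indexing. Note that vapply() still iterates at the R level, so this is tidier rather than necessarily faster. The popstat() defined here is a hypothetical stand-in (a column mean) since the real statistic is not given, and the names loo() and dat are also made up. When the statistic is a function of simple sums, a fully vectorized closed form may exist, as the last lines show for the mean.

```r
# Hypothetical stand-in for the real statistic: the mean of column x.
popstat <- function(d) mean(d$x)

# Leave-one-out estimates: drop row i, recompute the statistic.
loo <- function(datfrm, stat = popstat) {
  vapply(seq_len(nrow(datfrm)),
         function(i) stat(datfrm[-i, , drop = FALSE]),
         numeric(1))
}

dat <- data.frame(x = c(1, 2, 3, 4))
est <- loo(dat)

# For a mean, the same values follow from sums with no iteration at all.
est_closed <- (sum(dat$x) - dat$x) / (nrow(dat) - 1)
```

The closed form only works because the mean can be updated from a running sum; for a statistic without that property, the vapply() version (or compiled code) is the fallback.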