Re: [R] random sampling with some limitive conditions?
    ##
    ## Licence   : GPL Version 2
    ## Copyright (c) Gavin L. Simpson, 2007
    ## History   : 27-April-2007 - Created
    ##
    rBinMat <- function(x, burn.in = 1, skip = 1000) {
        dim.nams <- dimnames(x)
        x <- as.matrix(x)
        dimnames(x) <- NULL
        ## number rows/cols
        n.col <- ncol(x)
        n.row <- nrow(x)
        ## is the 2x2 matrix diagonal or anti-diagonal
        ## (as.numeric so that integer 0/1 input also matches)
        isDiag <- function(x) {
            X <- as.numeric(x)
            if(identical(X, c(1,0,0,1)))
                return(TRUE)
            else if(identical(X, c(0,1,1,0)))
                return(TRUE)
            else
                return(FALSE)
        }
        changed <- 0
        ## do the burn in changes, then skip, then a final swap,
        ## this is the first random matrix
        ## (identical() and sample() are called directly here; the
        ## .Internal() forms are not usable from user code in current R)
        while(changed <= (burn.in + skip + 1)) {
            ran.row <- sample(n.row, 2, replace = FALSE)
            ran.col <- sample(n.col, 2, replace = FALSE)
            if(isDiag(x[ran.row, ran.col])) {
                ## flip the 2x2 block: 1 -> 0 and 0 -> 1
                x[ran.row, ran.col] <- c(1,0)[x[ran.row, ran.col] + 1]
                changed <- changed + 1
            }
        }
        dimnames(x) <- dim.nams
        return(x)
    }

On Sun, 2007-07-08 at 11:12 -0600, Zhang Jian wrote:
> It is not right. My data are presence-absence data, and I want to get
> thousands of random presence-absence data sets with the same numbers of
> rows and columns as the original data. The new data must also have the
> same sum for each row and each column as the original data.
>
> How can I get the new random data? Thanks.

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
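A minimal sketch of how a swap routine of this kind might be used to draw many matrices, using base R only. The helper name swap_matrix, its n.swaps argument, and the choice to restart every draw from the original matrix are assumptions made for illustration; they are not part of the code posted above.

```r
# Presence-absence matrix from the thread (8 x 8), entered row by row.
sites <- matrix(c(1,0,0,0,1,1,0,0,
                  0,1,1,1,0,1,0,1,
                  1,0,0,0,1,0,1,0,
                  0,0,0,1,0,1,0,1,
                  1,0,1,0,0,0,0,0,
                  0,1,0,1,1,1,1,1,
                  1,0,0,0,0,0,0,0,
                  0,0,0,1,0,1,0,1),
                nrow = 8, byrow = TRUE,
                dimnames = list(NULL, paste0("site", 1:8)))

# Hypothetical helper: perform n.swaps successful checkerboard swaps.
swap_matrix <- function(x, n.swaps = 50) {
  done <- 0
  while (done < n.swaps) {
    r <- sample(nrow(x), 2)          # two distinct rows
    k <- sample(ncol(x), 2)          # two distinct columns
    sub <- x[r, k]
    # swappable only if the 2x2 block is diagonal or anti-diagonal
    if (sub[1, 1] == sub[2, 2] && sub[1, 2] == sub[2, 1] &&
        sub[1, 1] != sub[1, 2]) {
      x[r, k] <- 1 - sub             # flipping the block keeps all margins
      done <- done + 1
    }
  }
  x
}

set.seed(1)
sims <- replicate(100, swap_matrix(sites), simplify = FALSE)

# Every simulated matrix should reproduce the original row and column sums.
margins_ok <- vapply(sims, function(m)
  all(rowSums(m) == rowSums(sites)) && all(colSums(m) == colSums(sites)),
  logical(1))
all(margins_ok)
```

Restarting from the original matrix for each draw is simple but wasteful. Running one long chain and keeping every skip-th matrix, as the burn.in and skip arguments above suggest, would be the other obvious design.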
Re: [R] random sampling with some limitive conditions?
The method does produce a new data set, but I do not think it is random. I used the new data to compute the index I am interested in, and I got the same value as with the original data sites. I tried again and again, and the result is always the same. So I think I need to find a different random sampling method.

On 7/7/07, Daniel Nordlund wrote:
> You could reorder your table by stepping through your table a column at a
> time, and for each column randomly deciding to swap the current column
> with a column that has the same column total. Repeat this process for
> each row, i.e. for each row, randomly choose a row with the same row
> total to swap with.
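One possible reason for the unchanged index, sketched under an explicit assumption: the thread does not say which index was computed, but any index that depends only on which sites co-occur, and not on the order of rows or columns, cannot change when whole rows or whole columns are reordered. The statistic cooc_profile below is a made-up stand-in for such an index, not the poster's own.

```r
# Presence-absence matrix from the thread (8 x 8), entered row by row.
sites <- matrix(c(1,0,0,0,1,1,0,0,
                  0,1,1,1,0,1,0,1,
                  1,0,0,0,1,0,1,0,
                  0,0,0,1,0,1,0,1,
                  1,0,1,0,0,0,0,0,
                  0,1,0,1,1,1,1,1,
                  1,0,0,0,0,0,0,0,
                  0,0,0,1,0,1,0,1),
                nrow = 8, byrow = TRUE)

# Illustrative statistic (an assumption, not the poster's index):
# the sorted numbers of shared presences over all pairs of columns.
cooc_profile <- function(m) {
  shared <- crossprod(m)   # shared[i, j] = rows where columns i and j are both 1
  sort(shared[upper.tri(shared)])
}

set.seed(42)
# Reorder whole rows and whole columns at random. Swapping only rows or
# columns with equal totals is a special case of this.
shuffled <- sites[sample(nrow(sites)), sample(ncol(sites))]

identical(cooc_profile(sites), cooc_profile(shuffled))
```

If the index of interest behaves like this, a method that changes individual cells while keeping the margins is needed, rather than one that only reorders rows and columns.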
Re: [R] random sampling with some limitive conditions?
Does anyone have a method or any advice for this random sampling problem? I have no idea. Thanks a lot.

On 7/8/07, Zhang Jian wrote:
> The method does produce a new data set, but I do not think it is random.
> I used the new data to compute the index I am interested in, and I got
> the same value as with the original data sites. I tried again and again,
> and the result is always the same. So I think I need to find a different
> random sampling method.
[R] random sampling with some limitive conditions?
If I understand your problem, this might be a solution. Assign independent random numbers for row and column and use the corresponding ordering to assign the row and column indices. Thus row and column assignments are independent and the row and column totals are fixed. If rr and cc are respectively the desired row and column totals, with sum(cc)==sum(rr), then

    n = sum(cc)
    row.assign = rep(1:length(rr),rr)[order(runif(n))]
    col.assign = rep(1:length(cc),cc)[order(runif(n))]

If you want many such sets of random assignments to be generated at once you can use a few more rep() calls in the expressions to generate multiple sets in the same way. (Do you actually want the assignments or just the tables?)

Of course there are many other possible solutions since you have not fully specified the distribution you want.

Alan Zaslavsky
Harvard U

> From: Zhang Jian
> Subject: [R] random sampling with some limitive conditions?
> To: r-help@stat.math.ethz.ch
>
> I want to gain thousands of random sampling data by randomizing the
> presence-absence data. Meantime, one important limition is that the row
> and column sums must be fixed.
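A small sketch of what these assignments look like once they are tabulated, using the row and column totals of the example matrix from the thread. The cross-tabulation step and the variable name tab are additions for illustration and were not part of the suggestion above.

```r
rr <- c(3, 5, 3, 3, 2, 6, 1, 3)   # row totals of the example matrix
cc <- c(4, 2, 2, 4, 3, 5, 2, 4)   # column totals of the example matrix
stopifnot(sum(rr) == sum(cc))

set.seed(7)
n <- sum(cc)
row.assign <- rep(seq_along(rr), rr)[order(runif(n))]
col.assign <- rep(seq_along(cc), cc)[order(runif(n))]

# Cross-tabulate the n (row, column) pairs into an 8 x 8 table of counts.
# factor() with explicit levels keeps rows or columns that happen to be empty.
tab <- table(factor(row.assign, levels = seq_along(rr)),
             factor(col.assign, levels = seq_along(cc)))

all(rowSums(tab) == rr)   # row totals match by construction
all(colSums(tab) == cc)   # column totals match by construction
max(tab)                  # can exceed 1: cells hold counts, not just 0/1
```

The margins are always reproduced, but nothing stops the same (row, column) pair from being drawn more than once, so the resulting table is a table of counts and need not be a presence-absence matrix.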
Re: [R] random sampling with some limitive conditions?
It is not right. My data are presence-absence data, and I want to get thousands of random presence-absence data sets with the same numbers of rows and columns as the original data. The new data must also have the same sum for each row and each column as the original data.

For example, the data sites:

    site1 site2 site3 site4 site5 site6 site7 site8
        1     0     0     0     1     1     0     0
        0     1     1     1     0     1     0     1
        1     0     0     0     1     0     1     0
        0     0     0     1     0     1     0     1
        1     0     1     0     0     0     0     0
        0     1     0     1     1     1     1     1
        1     0     0     0     0     0     0     0
        0     0     0     1     0     1     0     1

    apply(sites,2,sum)
    site1 site2 site3 site4 site5 site6 site7 site8
        4     2     2     4     3     5     2     4
    apply(sites,1,sum)
    [1] 3 5 3 3 2 6 1 3

Suppose I get the new data sites.random:

    site1 site2 site3 site4 site5 site6 site7 site8
        1     0     0     0     1     1     0     0
        1     1     1     1     0     1     0     0
        1     0     0     0     1     0     1     0
        0     0     0     1     0     1     0     1
        0     0     1     0     0     0     0     1
        0     1     0     1     1     1     1     1
        1     0     0     0     0     0     0     0
        0     0     0     1     0     1     0     1

    apply(sites.random,2,sum)  # the same as for the original data
    site1 site2 site3 site4 site5 site6 site7 site8
        4     2     2     4     3     5     2     4
    apply(sites.random,1,sum)  # the same as for the original data
    [1] 3 5 3 3 2 6 1 3

How can I get the new random data? Thanks.

On 7/8/07, Alan Zaslavsky wrote:
> If I understand your problem, this might be a solution. Assign
> independent random numbers for row and column and use the corresponding
> ordering to assign the row and column indices. Thus row and column
> assignments are independent and the row and column totals are fixed.
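As a quick check of the example in this message, the sketch below enters both matrices as posted and lists the cells in which they differ. The helper to_mat is a hypothetical convenience for typing the data in row by row.

```r
# Hypothetical helper: build an 8 x 8 matrix from values typed row by row.
to_mat <- function(v) matrix(v, nrow = 8, byrow = TRUE)

sites <- to_mat(c(1,0,0,0,1,1,0,0,
                  0,1,1,1,0,1,0,1,
                  1,0,0,0,1,0,1,0,
                  0,0,0,1,0,1,0,1,
                  1,0,1,0,0,0,0,0,
                  0,1,0,1,1,1,1,1,
                  1,0,0,0,0,0,0,0,
                  0,0,0,1,0,1,0,1))

sites.random <- to_mat(c(1,0,0,0,1,1,0,0,
                         1,1,1,1,0,1,0,0,
                         1,0,0,0,1,0,1,0,
                         0,0,0,1,0,1,0,1,
                         0,0,1,0,0,0,0,1,
                         0,1,0,1,1,1,1,1,
                         1,0,0,0,0,0,0,0,
                         0,0,0,1,0,1,0,1))

# Cells in which the two matrices differ, as (row, col) pairs.
changed <- which(sites != sites.random, arr.ind = TRUE)
changed

# The 2 x 2 block at the affected rows and columns, before and after.
sites[c(2, 5), c(1, 8)]
sites.random[c(2, 5), c(1, 8)]
```

The two matrices differ only in rows 2 and 5 at columns 1 and 8, where an anti-diagonal 2 x 2 block has become a diagonal one. That is a single "checkerboard" swap, which is why every row and column sum is unchanged.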
[R] random sampling with some limitive conditions?
I want to generate thousands of random data sets by randomizing presence-absence data. One important limitation is that the row and column sums must stay fixed.

For example, the data tst are as follows:

    site1 site2 site3 site4 site5 site6 site7 site8
        1     0     0     0     1     1     0     0
        0     1     1     1     0     1     0     1
        1     0     0     0     1     0     1     0
        0     0     0     1     0     1     0     1
        1     0     1     0     0     0     0     0
        0     1     0     1     1     1     1     1
        1     0     0     0     0     0     0     0
        0     0     0     1     0     1     0     1

sum(tst[1,]) = 3, sum(tst[,1]) = 4, and so on. When I randomize the data, the first row sum must equal 3 and the first column sum must equal 4. The same rule applies to every row and every column.

How can I get the new random data? I have no idea. Thanks.
Re: [R] random sampling with some limitive conditions?
> -----Original Message-----
> On Behalf Of Zhang Jian
> Sent: Saturday, July 07, 2007 12:31 PM
> To: r-help
> Subject: [R] random sampling with some limitive conditions?
>
> I want to gain thousands of random sampling data by randomizing the
> presence-absence data. Meantime, one important limition is that the row
> and column sums must be fixed.

You could reorder your table by stepping through your table a column at a time, and for each column randomly deciding to swap the current column with a column that has the same column total. Repeat this process for each row, i.e. for each row, randomly choose a row with the same row total to swap with. Here is some example code which is neither efficient nor general, but does demonstrate the basic idea. You will need to decide if this approach meets your needs.

    # I created a data file with your table (8x8) and read from it
    sites <- read.table("c:/R/R-examples/site_random_sample.txt", header=TRUE)
    sites

    # get row and column totals
    colsums <- apply(sites,2,sum)
    rowsums <- apply(sites,1,sum)

    # randomly swap columns
    for(i in 1:8) {
      if (runif(1) > .5) {
        # index into the candidates rather than calling sample() on them:
        # sample(k, 1) with a single number k would draw from 1:k
        cand <- which(colsums==colsums[i])
        swapcol <- cand[sample.int(length(cand), 1)]
        temp <- sites[,swapcol]
        sites[,swapcol] <- sites[,i]
        sites[,i] <- temp
      }
    }

    # randomly swap rows
    for(i in 1:8) {
      if (runif(1) > .5) {
        cand <- which(rowsums==rowsums[i])
        swaprow <- cand[sample.int(length(cand), 1)]
        temp <- sites[swaprow,]
        sites[swaprow,] <- sites[i,]
        sites[i,] <- temp
      }
    }
    sites

Hope this is helpful,

Dan

Daniel Nordlund
Bothell, WA USA