Re: [R] random sampling with some limitive conditions?

2007-07-09 Thread Gavin Simpson
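############################################################
##                                                        ##
## Licence : GPL Version 2                                ##
##                                                        ##
## Copyright (c) Gavin L. Simpson, 2007                   ##
##                                                        ##
## History : 27-April-2007 - Created                      ##
##                                                        ##
############################################################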
rBinMat <- function(x, burn.in = 1, skip = 1000) {
  dim.nams <- dimnames(x)
  x <- as.matrix(x)
  dimnames(x) <- NULL
  ## number rows/cols
  n.col <- ncol(x)
  n.row <- nrow(x)
  ## is the 2x2 matrix diagonal or anti-diagonal
  isDiag <- function(x) {
    X <- as.vector(x)
    ## compare by value so that integer and double 0/1 data both work
    all(X == c(1, 0, 0, 1)) || all(X == c(0, 1, 1, 0))
  }
  changed <- 0
  ## do the burn in changes, then skip, then a final swap,
  ## this is the first random matrix
  while (changed <= (burn.in + skip + 1)) {
    ## plain sample() is used here; .Internal() is not meant for user code
    ran.row <- sample(n.row, 2, replace = FALSE)
    ran.col <- sample(n.col, 2, replace = FALSE)
    if (isDiag(x[ran.row, ran.col])) {
      ## flip the 2x2 block: 0 becomes 1 and 1 becomes 0
      x[ran.row, ran.col] <- c(1, 0)[x[ran.row, ran.col] + 1]
      changed <- changed + 1
    }
  }
  dimnames(x) <- dim.nams
  return(x)
}
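
A minimal sketch of how a swap function of this kind might be used to produce many randomized matrices and to confirm that the margins survive. The helper name swap_matrix, the number of swaps, and the number of replicates are illustrative assumptions and not part of the posted code; the data are the 8x8 example from this thread.

```r
# Example presence-absence data from the thread (8 rows by 8 sites)
sites <- matrix(c(1,0,0,0,1,1,0,0,
                  0,1,1,1,0,1,0,1,
                  1,0,0,0,1,0,1,0,
                  0,0,0,1,0,1,0,1,
                  1,0,1,0,0,0,0,0,
                  0,1,0,1,1,1,1,1,
                  1,0,0,0,0,0,0,0,
                  0,0,0,1,0,1,0,1),
                nrow = 8, byrow = TRUE,
                dimnames = list(NULL, paste0("site", 1:8)))

# Hypothetical helper: perform n.swaps successful 2x2 checkerboard swaps
swap_matrix <- function(x, n.swaps = 200) {
  done <- 0
  while (done < n.swaps) {
    r <- sample(nrow(x), 2)
    k <- sample(ncol(x), 2)
    sub <- x[r, k]
    # swappable only if the 2x2 block is diagonal or anti-diagonal
    if (sub[1, 1] == sub[2, 2] && sub[1, 2] == sub[2, 1] &&
        sub[1, 1] != sub[1, 2]) {
      x[r, k] <- 1 - sub   # flipping the block keeps all row/column sums
      done <- done + 1
    }
  }
  x
}

set.seed(42)
# simplify = FALSE keeps each simulated matrix as its own list element
sims <- replicate(100, swap_matrix(sites), simplify = FALSE)

# every simulated matrix should keep the original row and column sums
ok <- vapply(sims, function(m)
  all(rowSums(m) == rowSums(sites)) && all(colSums(m) == colSums(sites)),
  logical(1))
all(ok)
```

Each call here restarts from the original matrix; chaining calls (feeding one result into the next) would be closer to the burn-in and skip idea in the function above.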


On Sun, 2007-07-08 at 11:12 -0600, Zhang Jian wrote:
> That is not right. My data are presence-absence data, and I want to
> generate thousands of random presence-absence data sets with the same
> numbers of rows and columns as the original data. In addition, each new
> data set must have the same row and column sums as the original data.
> For example:
> The data sites:
> site1 site2 site3 site4 site5 site6 site7 site8
> 1 0 0 0 1 1 0 0
> 0 1 1 1 0 1 0 1
> 1 0 0 0 1 0 1 0
> 0 0 0 1 0 1 0 1
> 1 0 1 0 0 0 0 0
> 0 1 0 1 1 1 1 1
> 1 0 0 0 0 0 0 0
> 0 0 0 1 0 1 0 1
> > apply(sites,2,sum)
> site1 site2 site3 site4 site5 site6 site7 site8
> 4 2 2 4 3 5 2 4
> > apply(sites,1,sum)
> [1] 3 5 3 3 2 6 1 3
>
> Suppose I get the new data sites.random:
> site1 site2 site3 site4 site5 site6 site7 site8
> 1 0 0 0 1 1 0 0
> 1 1 1 1 0 1 0 0
> 1 0 0 0 1 0 1 0
> 0 0 0 1 0 1 0 1
> 0 0 1 0 0 0 0 1
> 0 1 0 1 1 1 1 1
> 1 0 0 0 0 0 0 0
> 0 0 0 1 0 1 0 1
> > apply(sites.random,2,sum) # the same as for the original data
> site1 site2 site3 site4 site5 site6 site7 site8
> 4 2 2 4 3 5 2 4
> > apply(sites.random,1,sum) # the same as for the original data
> [1] 3 5 3 3 2 6 1 3
>
> How can I generate such new random data? Thanks.
 
 
 

Re: [R] random sampling with some limitive conditions?

2007-07-08 Thread Zhang Jian
The method does produce a new data set, but I do not think it is random. I
used the new data to compute the index I am interested in, and I got the
same value as with the original data sites.
I tried it again and again, and the result is always the same.
So I think I need to find a different random sampling method.




__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] random sampling with some limitive conditions?

2007-07-08 Thread Zhang Jian
Does anyone have any methods or advice for this random sampling problem?
I have no idea how to do it.
Thanks a lot.




[R] random sampling with some limitive conditions?

2007-07-08 Thread Alan Zaslavsky
If I understand your problem, this might be a solution.  Assign 
independent random numbers for row and column and use the corresponding 
ordering to assign the row and column indices.  Thus row and column 
assignments are independent and the row and column totals are fixed.  If 
rr and cc are respectively the desired row and column totals, with 
sum(cc)==sum(rr), then

n = sum(cc)
row.assign = rep(1:length(rr),rr)[order(runif(n))]
col.assign = rep(1:length(cc),cc)[order(runif(n))]

If you want many such sets of random assignments to be generated at once 
you can use a few more rep() calls in the expressions to generate multiple 
sets in the same way.  (Do you actually want the assignments or just the 
tables?) Of course there are many other possible solutions since you have 
not fully specified the distribution you want.
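
As an illustration of what these assignments produce when tabulated, here is a small sketch using the row and column totals of the 8x8 example posted in this thread. The use of table() with factor() and the seed are assumptions made for the illustration. Note that the margins are fixed, but a cell can receive more than one assignment, so the resulting table is a table of counts and is not guaranteed to be a 0/1 matrix.

```r
rr <- c(3, 5, 3, 3, 2, 6, 1, 3)   # row totals of the example data
cc <- c(4, 2, 2, 4, 3, 5, 2, 4)   # column totals of the example data
stopifnot(sum(rr) == sum(cc))

set.seed(1)
n <- sum(cc)
# one row index and one column index per unit, each shuffled independently
row.assign <- rep(seq_along(rr), rr)[order(runif(n))]
col.assign <- rep(seq_along(cc), cc)[order(runif(n))]

# cross-tabulate; factor() fixes the set of levels so the table is always 8x8
tab <- table(factor(row.assign, levels = seq_along(rr)),
             factor(col.assign, levels = seq_along(cc)))

rowSums(tab)   # equals rr
colSums(tab)   # equals cc
max(tab)       # may be larger than 1: cells are counts, not presence-absence
```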

Alan Zaslavsky
Harvard U



Re: [R] random sampling with some limitive conditions?

2007-07-08 Thread Zhang Jian
That is not right. My data are presence-absence data, and I want to
generate thousands of random presence-absence data sets with the same
numbers of rows and columns as the original data. In addition, each new
data set must have the same row and column sums as the original data.
For example:
The data sites:
site1 site2 site3 site4 site5 site6 site7 site8
1 0 0 0 1 1 0 0
0 1 1 1 0 1 0 1
1 0 0 0 1 0 1 0
0 0 0 1 0 1 0 1
1 0 1 0 0 0 0 0
0 1 0 1 1 1 1 1
1 0 0 0 0 0 0 0
0 0 0 1 0 1 0 1
> apply(sites,2,sum)
site1 site2 site3 site4 site5 site6 site7 site8
4 2 2 4 3 5 2 4
> apply(sites,1,sum)
[1] 3 5 3 3 2 6 1 3

Suppose I get the new data sites.random:
site1 site2 site3 site4 site5 site6 site7 site8
1 0 0 0 1 1 0 0
1 1 1 1 0 1 0 0
1 0 0 0 1 0 1 0
0 0 0 1 0 1 0 1
0 0 1 0 0 0 0 1
0 1 0 1 1 1 1 1
1 0 0 0 0 0 0 0
0 0 0 1 0 1 0 1
> apply(sites.random,2,sum) # the same as for the original data
site1 site2 site3 site4 site5 site6 site7 site8
4 2 2 4 3 5 2 4
> apply(sites.random,1,sum) # the same as for the original data
[1] 3 5 3 3 2 6 1 3

How can I generate such new random data? Thanks.
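
The requirement described above can be written down as a small check. This is only a sketch: the helper name same_margins is hypothetical, and the example rebuilds sites.random from sites by flipping the 2x2 block in rows 2 and 5, columns 1 and 8, which is the only place where the two matrices shown differ.

```r
# Hypothetical helper: TRUE if two matrices have equal dimensions,
# equal row sums and equal column sums
same_margins <- function(a, b) {
  a <- as.matrix(a)
  b <- as.matrix(b)
  identical(dim(a), dim(b)) &&
    all(rowSums(a) == rowSums(b)) &&
    all(colSums(a) == colSums(b))
}

sites <- matrix(c(1,0,0,0,1,1,0,0,
                  0,1,1,1,0,1,0,1,
                  1,0,0,0,1,0,1,0,
                  0,0,0,1,0,1,0,1,
                  1,0,1,0,0,0,0,0,
                  0,1,0,1,1,1,1,1,
                  1,0,0,0,0,0,0,0,
                  0,0,0,1,0,1,0,1),
                nrow = 8, byrow = TRUE,
                dimnames = list(NULL, paste0("site", 1:8)))

# flip one 2x2 block (rows 2 and 5, columns 1 and 8) to get sites.random
sites.random <- sites
sites.random[c(2, 5), c(1, 8)] <- 1 - sites[c(2, 5), c(1, 8)]

same_margins(sites, sites.random)   # TRUE
identical(sites, sites.random)      # FALSE: the data themselves changed
```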





[R] random sampling with some limitive conditions?

2007-07-07 Thread Zhang Jian
I want to generate thousands of random data sets by randomizing
presence-absence data. One important limitation is that the row and
column sums must stay fixed. For example, the data tst are as follows:
site1 site2 site3 site4 site5 site6 site7 site8
1 0 0 0 1 1 0 0
0 1 1 1 0 1 0 1
1 0 0 0 1 0 1 0
0 0 0 1 0 1 0 1
1 0 1 0 0 0 0 0
0 1 0 1 1 1 1 1
1 0 0 0 0 0 0 0
0 0 0 1 0 1 0 1

sum(tst[1,]) = 3, sum(tst[,1]) = 4, and so on. When I randomize the data, the
first row sum must still equal 3 and the first column sum must still equal 4.
The same rule applies to every row and column.
How can I generate the new random data? I have no idea.
Thanks.



Re: [R] random sampling with some limitive conditions?

2007-07-07 Thread Daniel Nordlund
> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
> On Behalf Of Zhang Jian
> Sent: Saturday, July 07, 2007 12:31 PM
> To: r-help
> Subject: [R] random sampling with some limitive conditions?
>
> I want to generate thousands of random data sets by randomizing
> presence-absence data. One important limitation is that the row and
> column sums must stay fixed. For example, the data tst are as follows:
> site1 site2 site3 site4 site5 site6 site7 site8
> 1 0 0 0 1 1 0 0
> 0 1 1 1 0 1 0 1
> 1 0 0 0 1 0 1 0
> 0 0 0 1 0 1 0 1
> 1 0 1 0 0 0 0 0
> 0 1 0 1 1 1 1 1
> 1 0 0 0 0 0 0 0
> 0 0 0 1 0 1 0 1
>
> sum(tst[1,]) = 3, sum(tst[,1]) = 4, and so on. When I randomize the data, the
> first row sum must still equal 3 and the first column sum must still equal 4.
> The same rule applies to every row and column.
> How can I generate the new random data? I have no idea.
> Thanks.
 

You could reorder your table by stepping through your table a column at a time, 
and for each column randomly deciding to swap the current column with a column 
that has the same column total.  Repeat this process for each row, i.e. for 
each row, randomly choose a row with the same row total to swap with.

Here is some example code which is neither efficient nor general, but does 
demonstrate the basic idea.  You will need to decide if this approach meets your 
needs. 

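# I created a data file with your table (8x8) and read from it
sites <- read.table("c:/R/R-examples/site_random_sample.txt", header=TRUE)
sites
# get row and column totals
colsums <- apply(sites,2,sum)
rowsums <- apply(sites,1,sum)
# randomly swap columns
for(i in 1:8) {
  if (runif(1) > .5) {
    # candidate columns with the same total as column i (always includes i)
    cand <- which(colsums==colsums[i])
    # index into cand; sample(cand,1) would draw from 1:cand when cand has length 1
    swapcol <- cand[sample(length(cand),1)]
    temp <- sites[,swapcol]
    sites[,swapcol] <- sites[,i]
    sites[,i] <- temp
  }
}
# randomly swap rows
for(i in 1:8) {
  if (runif(1) > .5) {
    # candidate rows with the same total as row i (always includes i)
    cand <- which(rowsums==rowsums[i])
    swaprow <- cand[sample(length(cand),1)]
    temp <- sites[swaprow,]
    sites[swaprow,] <- sites[i,]
    sites[i,] <- temp
  }
}
sites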
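
One detail worth watching when picking a swap partner from the result of which(): base R's sample() treats a single positive number n as the range 1:n. The short sketch below illustrates this with the row totals of the example data, where row 2 is the only row with total 5. The helper name pick_one is hypothetical.

```r
# Hypothetical helper: pick one element of v, even when length(v) == 1
pick_one <- function(v) v[sample.int(length(v), 1)]

rowsums <- c(3, 5, 3, 3, 2, 6, 1, 3)    # row totals of the example data
cand <- which(rowsums == rowsums[2])   # only row 2 has total 5
cand                                    # 2

set.seed(1)
# sample(2, 1) draws from 1:2, so row 1 can come back although its total is 3
naive <- replicate(100, sample(cand, 1))
# indexing into cand can only ever return row 2
safe <- replicate(100, pick_one(cand))

table(naive)
unique(safe)   # 2
```

A swap with a row whose total differs changes the row sums, so indexing into the candidate vector is the safer way to choose.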


Hope this is helpful,

Dan

Daniel Nordlund
Bothell, WA USA
